Script 1 for Kitchel et al. 2023 in prep taxonomic diversity
manuscript.
library(tidyverse)
library(sp)
library(raster)
#library(rgeos)
library(rgbif)
library(viridis)
library(gridExtra)
library(rasterVis)
library(concaveman)
library(sf)
library(cowplot)
library(data.table)
set.seed(1)
Pull in the compiled and cleaned FishGlob data (V 1.5), downloaded on
November 28, 2022. This dataset is typically compiled by Dr. Aurore
Maureaud. It includes both public and private data, so the link
cannot be shared. However, with some editing you can run the analyses
for the public trawl surveys.
| AI | Aleutian Islands | Aleutian Islands | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| BITS-1 | Baltic Sea Q1 | Baltic Sea Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| BITS-4 | Baltic Sea Q4 | Baltic Sea Quarter 4 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| CHL | Chile | Chile | Universidad de Concepción, Chile | South America | Requires data request | Daniela Yepsen daniela.yepsen@gmail.com and Luis Cubillos lucubillos@gmail.com | Included |
| COL | Colombia | Colombian Caribbean | Universidad Nacional de Colombia | South America | Requires data request | Camilo B. Garcia cbgarciar@unal.edu.co | Too few years |
| DFO-HS | Hecate Strait | Hecate Strait | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/780a1c02-1f9c-4994-bc70-a0e9ef8e3968 and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| DFO-NF | Newfoundland | Newfoundland | Department of Fisheries and Oceans | Canada | Requires data request | Mariano Koen-Alonso mariano.koen-alonso@dfo-mpo.gc.ca | Included |
| DFO-QCS | Queen Charlotte Sound | Queen Charlotte Sound | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/a278d1af-d567-4964-a109-ae1e84cbd24a and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| DFO-SOG | Strait of Georgia | Strait of Georgia | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/d880ba18-8790-41a2-bf73-e9247380759b and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| DFO-WCHG | West Coast Haida Gwaii | West Coast Haida Gwaii | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/5ee30758-b1d6-49fe-8c4e-5136f4b39ad1 and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| DFO-WCVI | West Coast Vancouver Island | West Coast Vancouver Island | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/557e42ae-06fe-426d-8242-c3107670b1de and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| EBS | Eastern Bering Sea | Eastern Bering Sea | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| EVHOE | Bay of Biscay | Bay of Biscay | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| FALK | Falkland Islands | Falkland Islands | Falkland Islands Fisheries Department | Southern Ocean | Requires data request | Alexander Arkhipkin aarkhipkin@fisheries.gov.fk and Jorge Ramos jeramos@fisheries.gov.fk | Excluded after spatial temporal standardization in next script |
| FR-CGFS | English Channel | English Channel | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| GIN | Guinea | Guinea | National Center of Fisheries Sciences of Boussoura, Conakry, Republic of Guinea | Africa | Requires data request | Mohammed Lamine Camara mlcamara.kennedy@gmail.com | Inconsistent sampling through space and time |
| GMEX-Summer | Gulf of Mexico Summer | Gulf of Mexico Summer | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| GMEX-Fall | Gulf of Mexico Fall | Gulf of Mexico Fall | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| GOA | Gulf of Alaska | Gulf of Alaska | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| GRL-DE | Greenland | Greenland | Thuenen Institute of Sea Fisheries | Europe | Requires data request | Karl-Michael Werner karl-michael.werner@thuenen.de | Included |
| GSL-N | N Gulf of St. Lawrence | Northern Gulf of St. Lawrence | Department of Fisheries and Oceans | Canada | Public | See OceanAdapt: https://zenodo.org/records/8103080 for specific DFO links | Included |
| GSL-S | S Gulf of St. Lawrence | Southern Gulf of St. Lawrence | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/1989de32-bc5d-c696-879c-54d422438e64 and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| ICE-GFS | Iceland | Iceland | Marine and Freshwater Research Institute, Iceland | Europe | Requires data request | Jón Sólmundsson jon.solmundsson@hafogvatn.is | Included |
| IE-IGFS | Irish Sea | Irish Sea | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| IS-TAU | Israel | Israel | Tel Aviv University | Asia | Requires data request | Jonathan Belmaker jonathan.belmaker@gmail.com | Too few years |
| IS-MOAG | Israel | Israel | Israeli Ministry of Agriculture | Asia | Requires data request | Oren Sonin orens@moag.gov.il and Dori Edelist blackreefs@gmail.com | Inconsistent sampling through space and time |
| MEDITS | Mediterranean | Mediterranean | Multiple | Europe | Requires data request | Contact corresponding author for contacts | Included |
| MRT | Mauritania | Mauritania | Institut Mauritanien de Recherches Océanographiques et des Pêches, Nouadhibou, Mauritania | Africa | Requires data request | Beyah Meissa bmouldhabib@gmail.com | Inconsistent sampling through space and time |
| NAM | Namibia | Namibia | National Marine Information and Research Centre, Ministry of Fisheries and Marine Resources, Namibia | Africa | Requires data request | Johannes Kathena john.kathena@mfmr.gov.na | Included |
| NEUS-Fall | NE US Fall | Northeast USA Fall | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| NEUS-Spring | NE US Spring | Northeast USA Spring | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| NIGFS-1 | N Ireland Q1 | Northern Ireland Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NIGFS-4 | N Ireland Q4 | Northern Ireland Quarter 4 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| Nor-BTS-3 | Barents Sea Norway Q3 | Barents Sea Norway Q3 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NS-IBTS-1 | N Sea Q1 | North Sea Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NS-IBTS-3 | N Sea Q3 | North Sea Quarter 3 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NZ-CHAT | Chatham Rise NZ | Chatham Rise New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz | Included |
| NZ-ECSI | E Coast S Island NZ | East Coast South Island New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz | Included |
| NZ-SUBA | Sub-Antarctic NZ | Sub-Antarctic New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz | Included |
| NZ-WCSI | W Coast S Island NZ | West Coast South Island New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz | Included |
| PT-IBTS | Portugal | Portugal | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| ROCKALL | Rockall Plateau | Rockall Plateau | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| S-GEORG | S Georgia | South Georgia | British Antarctic Survey | Southern Ocean | Requires data request | Mark Belchier mark.belchier@gov.gs and Martin Collins macol@bas.ac.uk | Included |
| SCS-Fall | Scotian Shelf Fall | Scotian Shelf Fall | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| SCS-SPRING | Scotian Shelf Spring | Scotian Shelf Spring | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/fecf045a-95a2-4b69-8a40-818649a62716 and OceanAdapt: https://zenodo.org/records/8103080 | Too much data loss after spatial temporal standardization |
| SCS-SUMMER | Scotian Shelf Summer | Scotian Shelf Summer | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SEUS-fall | SE US Fall | Southeast USA Fall | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SEUS-spring | SE US Spring | Southeast USA Spring | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SEUS-summer | SE US Summer | Southeast USA Summer | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SWC-IBTS-1 | Scotland Shelf Sea Q1 | Scotland Shelf Sea Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| SWC-IBTS-4 | Scotland Shelf Sea Q4 | Scotland Shelf Sea Quarter 4 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| WBLS | Western Black Sea | Western Black Sea | Institute of Fish Resources, Bulgaria | Europe | Requires data request | Elitsa Petrova (elitssa@yahoo.com), Feriha Tserkova & Vesselina Mihneva | Too few years |
| WCANN | W Coast US | West Coast USA | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| ZAF-ATL | Atlantic Ocean ZA | Atlantic Ocean South Africa | Department of Forestry, Fisheries and the Environment, South Africa | Africa | Requires data request | Tracey Fairweather traceyf@daff.gov.za | Included |
| ZAF-IND | Indian Ocean ZA | Indian Ocean South Africa | Department of Forestry, Fisheries and the Environment, South Africa | Africa | Requires data request | Tracey Fairweather traceyf@daff.gov.za | Included |
FishGlob_1.5 <- fread(here::here("data","FISHGLOB_v1.5_clean.csv"))
This version of FishGlob leaves seasons out of the GMEX survey units; fix that here.
#add season to GMEX to survey unit
FishGlob_1.5[survey == "GMEX", survey_unit := paste0(survey,"-",season)]
Also adding in quarters for NIGFS
#add quarter to NIGFS survey unit
FishGlob_1.5[survey == "NIGFS", survey_unit := paste0(survey,"-",quarter)]
ZAF (South Africa) has distinct Atlantic and Indian surveys (split at
~20.01° E, Cape Agulhas)
FishGlob_1.5[survey == "ZAF" & longitude <20.01, survey_unit := "ZAF-ATL"][survey == "ZAF" & longitude >= 20.01, survey_unit := "ZAF-IND"]
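The three survey_unit fixes above can be sketched on a toy table. The table `toy` and all of its values are hypothetical and are not FishGlob records; the sketch only assumes the column names used above (`survey`, `season`, `quarter`, `longitude`, `survey_unit`).

```r
library(data.table)

# Hypothetical rows for illustration only (not FishGlob data)
toy <- data.table(
  survey      = c("GMEX", "GMEX", "NIGFS", "ZAF", "ZAF", "ZAF"),
  season      = c("Summer", "Fall", NA, NA, NA, NA),
  quarter     = c(NA, NA, 4, NA, NA, NA),
  longitude   = c(-90, -88, -6, 18.5, 25.0, NA),
  survey_unit = c("GMEX", "GMEX", "NIGFS", "ZAF", "ZAF", "ZAF")
)

# Append season (GMEX) or quarter (NIGFS) to the survey code
toy[survey == "GMEX", survey_unit := paste0(survey, "-", season)]
toy[survey == "NIGFS", survey_unit := paste0(survey, "-", quarter)]

# Split ZAF at 20.01 degrees E into Atlantic and Indian Ocean units
toy[survey == "ZAF" & longitude < 20.01, survey_unit := "ZAF-ATL"]
toy[survey == "ZAF" & longitude >= 20.01, survey_unit := "ZAF-IND"]

# Count rows per resulting survey unit
toy[, .N, by = survey_unit]
```

In this sketch the row with a missing longitude matches neither ZAF condition (data.table treats NA in `i` as FALSE), so it keeps the plain "ZAF" label.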
Region names
sort(unique(FishGlob_1.5[,survey_unit]))
[1] "AI" "BITS-1" "BITS-4" "CHL" "COL" "DFO-HS" "DFO-NF" "DFO-QCS" "DFO-SOG" "DFO-WCHG" "DFO-WCVI" "EBS"
[13] "EVHOE" "FALK" "FR-CGFS" "GIN" "GMEX-Fall" "GMEX-Summer" "GOA" "GRL-DE" "GSL-N" "GSL-S" "ICE-GFS" "IE-IGFS"
[25] "IS-MOAG" "IS-TAU" "MEDITS" "MRT" "NAM" "NEUS-Fall" "NEUS-Spring" "NIGFS-1" "NIGFS-4" "Nor-BTS" "NS-IBTS-1" "NS-IBTS-3"
[37] "NZ-CHAT" "NZ-ECSI" "NZ-SUBA" "NZ-WCSI" "PT-IBTS" "ROCKALL" "S-GEORG" "SCS-FALL" "SCS-SPRING" "SCS-SUMMER" "SEUS-fall" "SEUS-spring"
[49] "SEUS-summer" "SWC-IBTS-1" "SWC-IBTS-4" "WBLS" "WCANN" "WCTRI" "ZAF" "ZAF-ATL" "ZAF-IND"
##Data Replacements
####Greenland (version in FishGlob 1.5 is missing lengths and therefore biomass values)
This version was obtained directly from Karl-Michael Werner
(karl-michael.werner@thuenen.de), who now manages the Greenland survey
(September 2023). He is based in Germany.
#greenland <-
####Norway
Prepped by Laurene Pecuchet (U Tromsø, Norway) September
2023 to replace what's in FishGlob 1.5 because IMR "are quite concerned
that FishGlob, and other studies, have been using a 'flawed'
multi-surveys dataset that is available in NMDC (data portal of IMR).
Turns out that this dataset was put publicly by miscommunication on NMDC
after one published paper in Scientific Reports, and I think they only
realized the existence of this dataset just the last year as some papers
are coming out using it (especially the one from Cesc Gordo-Vilaseca in
PNAS https://www.pnas.org/doi/10.1073/pnas.2120869120). They
are now trying to make some damage controls to make sure that this
dataset is not used ever again in the future, but that cleanded and
standardised datasets of the Barents Sea survey that are publicly
available in NMDC are used instead of."
September 14: From Laurene, "I send you in attachment the 'new' IMR
survey formatted for Fishglob. I have done some small check of the
dataset, and so far everything looks good, but I didn't do a deep check
yet, but I don't see why there should be any problems with it... For your
study, I think it is also important that you know that there has been
some inconsistencies in taxonomic descriptions in the Barents Sea so
that some species should be considered at the genus level instead of for
biodiversity analysis, I send you in attach an excel (Barents Sea Fish
Reference List.csv) file that summarize which species might be a
misidentification and which one should be considered and merged." All of
these files now live in "data/Norway_Sep2023"
Helpful guidance from here: https://www.hi.no/en/hi/nettrapporter/rapport-fra-havforskningen-en-2021-15

"2.2.5 - Recommended adjustments to the output before analysis

Eelpouts and liparids. When combing years, we recommend that all records
of eelpouts (Zoarcidae) are pooled to the family level, because they are
notoriously difficult to identify (see Appendix 3). The same apply to
liparids (Liparidae). If species level data of these families are used,
consider excluding data from 2004-2006/2007. These years the staff on
some of the Norwegian vessels were inexperienced, and proper
identification keys for arctic species were lacking (compare for
instance catches of Lycodes frigidus and Lycodes eudipleurostictus in
the first years to the later years, Appendix 3). If species level data
of these families are used, records to family levels should be removed
or else these will be treated as a separate species in the further
analysis of the data. Both Zoarcidae and Liparidae have unresolved
taxonomy for some genera, therefore we have chosen to pool all liparids
of the genus Careproctus and all eelpouts of the genus Gymnelus in the
output.

Sebastes. The column 'Sebastes spp.' contains mainly juvenile
redfish. Small specimens are very difficult to identify so the protocol
is to identify only individuals larger than 10 cm to the species level.
Before analysis, all redfish (S. mentella, S. norvegicus, S.
viviparus and Sebastes spp.) should be pooled, or Sebastes spp. should
be removed - if not it will be treated as a separate species in the
analysis.

Records in Appendix 2. The records of the S. viviparus west
of Svalbard (Spitsbergen) are unreliable and should be removed if
Sebastes data are kept at the species level (Appendix 2). Species
verified for the Barents Sea, but outliers in terms the normal depth
range, distribution area within the Barents Sea, size etc. were coded as
questionable in the data base (Appendix 2) and should be removed before
analysis. Consider also removing pelagic species (e.g. capelin and
herring), as these are poorly sampled by the bottom trawl. The data
should be standardised with towing distance before analysis."
Therefore, we will:
- Remove all records of eelpouts and liparids (Family = Zoarcidae or Liparidae) (as we only include species ID'd to species)
- Remove redfish (Genus = Sebastes)
#load Norwegian data
load(here::here("data","Norway_Sep2023","NOR-BTS_clean.RData"))
norway_clean <- data.table(data)
#remove observations without dates
norway_clean <- norway_clean[complete.cases(norway_clean[,.(month)]),]
#remove species records in accordance with recommendation from HI
norway_clean <- norway_clean[!(family %in% c("Zoarcidae","Liparidae") | genus == "Sebastes"),]
#some column names don't match fishglob (fishglob = num, num_h, num_cpue, wgt, wgt_h, wgt_cpue; norway = num, num_cpue (number of ind./hour), num_cpua (number of ind./km2), wgt, wgt_cpue (kg/min), wgt_cpua (kg/km2))
#also, some column units in the readme are incorrect. Therefore, I will generate _cpue and _h values here
#we will need to check and rename columns
setnames(norway_clean, c("haul_dur"), c("haul_dur_m"))
norway_clean[,haul_dur := haul_dur_m/60] #haul duration currently in minutes, need hours
norway_clean[,num_h := num/haul_dur][,num_cpue := num/area_swept][,wgt_h := wgt/haul_dur][,wgt_cpue := wgt/area_swept]
#change some columns to numeric
cols = c("month","day")
norway_clean[,(cols) := lapply(.SD,as.numeric),.SDcols = cols]
#also, delete source and timestamp
fishglob_colnames <- colnames(FishGlob_1.5)
norway_clean <- norway_clean[,..fishglob_colnames]
norway_clean[survey == "Nor-BTS" & month %in% c(1:6), survey_unit := "Nor-BTS-1"][survey == "Nor-BTS" & month %in% c(7:12), survey_unit := "Nor-BTS-3"]
#IBTS and Nor-BTS surveys overlap below 62° latitude, so delete all hauls that occur below 62° latitude
norway_clean <- norway_clean[latitude >= 62,]
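The unit conversions in the Norway block can be checked by hand on two made-up hauls. This is a minimal sketch, assuming `haul_dur_m` is in minutes and `area_swept` is in km2 (the second unit is an assumption made for illustration; the table `toy` and its values are hypothetical).

```r
library(data.table)

# Hypothetical hauls: counts, weights (kg), duration (min), swept area (assumed km2)
toy <- data.table(
  haul_id    = c("h1", "h2"),
  num        = c(30, 10),
  wgt        = c(6, 2),
  haul_dur_m = c(30, 15),
  area_swept = c(0.05, 0.02)
)

toy[, haul_dur := haul_dur_m / 60]   # minutes -> hours
toy[, num_h    := num / haul_dur]    # individuals per hour
toy[, num_cpue := num / area_swept]  # individuals per unit area
toy[, wgt_h    := wgt / haul_dur]    # weight per hour
toy[, wgt_cpue := wgt / area_swept]  # weight per unit area

toy[, .(haul_id, num_h, num_cpue, wgt_h, wgt_cpue)]
```

Working through one haul by hand like this is a quick way to confirm that the recomputed columns carry the units FishGlob expects before binding the tables together.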
Delete Norway from FishGlob (the Greenland replacement is skipped for now)
FishGlob_1.5 <- FishGlob_1.5[!(survey %in% c("Nor-BTS"
#,
#"GRL-DE" #ignore greenland for now...
))]
Add in updated Norway data (the Greenland replacement is commented out for now)
FishGlob_1.5 <-rbind(FishGlob_1.5,norway_clean)
#FishGlob_1.5 <-rbind(FishGlob_1.5,greenland)
##Preliminary Data Cuts
###Specific Regional Changes Before Cutting to 10 years only
GSL
- North: we have data 1980-2019, but gear changes in 2004/2005, so let's use the later portion (more consistent months of sampling; 2005-2019; 15 years)
- South: we have data 1970-2019, but gear/vessel changes in 1985 and again in 1992, so again let's use the later portion (1992-2019; 27 years)
- See this github issue
#identify haul_ids of hauls we should remove from GSL surveys
haul_ids_to_remove_GSL <- unique(FishGlob_1.5[(survey == "GSL-N" & year < 2005)|(survey == "GSL-S" & year < 1992),haul_id])
FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_GSL),] #remove hauls before consistent gear/vessel was used
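If more surveys ever need a start-year cutoff, one alternative to hard-coding each condition is a small lookup table joined onto the data. This is a sketch of that alternative design, not the approach used above; `toy`, `cutoffs`, and `first_year` are hypothetical names and values.

```r
library(data.table)

# Hypothetical hauls (not FishGlob data)
toy <- data.table(
  survey  = c("GSL-N", "GSL-N", "GSL-S", "GSL-S", "EBS"),
  year    = c(2004, 2005, 1991, 1992, 1985),
  haul_id = c("a", "b", "c", "d", "e")
)

# One row per survey that needs a cutoff; first_year is the first year kept
cutoffs <- data.table(
  survey     = c("GSL-N", "GSL-S"),
  first_year = c(2005, 1992)
)

# Left join, then keep hauls with no cutoff or at/after the cutoff year
toy_cut <- merge(toy, cutoffs, by = "survey", all.x = TRUE)
kept    <- toy_cut[is.na(first_year) | year >= first_year]
kept
```

Surveys without an entry in `cutoffs` get `NA` for `first_year` and are kept in full.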
SGEORG
- From Martin Collins, "Most surveys were focused on demersal fish on the South Georgia shelf (< 350 m), but surveys in 2003, 2010 and 2019 had some deeper trawls. The deeper trawls caught very different fish, so are unlikely to be of use to a long-term analysis, but I have left them in."
- Delete all trawls deeper than 350 m
haul_ids_to_remove_SGEORG <- unique(FishGlob_1.5[(survey %in% c("SGEORG","S-GEORG") & depth >350),haul_id]) #the region names above list this survey as "S-GEORG", so match either spelling
FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_SGEORG),] #remove hauls deeper than 350 m
NZ-CHAT
- Bump December observations to the next year, because sampling spans months 12, 1, and 2
#bump observations forward
FishGlob_1.5[survey == "NZ-CHAT" & month == 12, year := year+1, ]
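A toy example of the December bump, using made-up rows (`toy` is hypothetical): one field season that starts in December 2009 and continues into early 2010 ends up under a single year label.

```r
library(data.table)

# Hypothetical NZ-CHAT hauls from one field season spanning the new year
toy <- data.table(
  survey = "NZ-CHAT",
  year   = c(2009, 2010, 2010),
  month  = c(12, 1, 2)
)

# Assign December hauls to the following calendar year
toy[survey == "NZ-CHAT" & month == 12, year := year + 1]

toy[, uniqueN(year)]  # the whole season now shares one year label
```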
###Because time is an essential component of these analyses, we will
get rid of any survey x season combinations that are not sampled for at
least 10 years
#new column for total number of years sampled
FishGlob_1.5[,years_sampled := length(unique(year)),.(survey_unit)]
summary(FishGlob_1.5$years_sampled) #ranges from 2 (DFO Strait of Georgia) to 57 (Northeast US)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.00 23.00 29.00 30.89 37.00 57.00
View(unique(FishGlob_1.5[,.(survey_unit, years_sampled)]))
#statistics about full dataset
nrow(FishGlob_1.5)
[1] 4172996
length(unique(FishGlob_1.5[,survey]))
[1] 45
length(unique(FishGlob_1.5[,survey_unit]))
[1] 57
#remove observations for any region x season combination sampled in fewer than 10 years
FishGlob.10year <- FishGlob_1.5[years_sampled >= 10,]
#statistics about reduced 10 year dataset
nrow(FishGlob.10year)
[1] 4089112
length(unique(FishGlob.10year[,survey]))
[1] 38
length(unique(FishGlob.10year[,as.character(survey_unit)]))
[1] 48
#remove full database
rm(FishGlob_1.5)
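The 10-year rule can be illustrated on a toy table with one long and one short time series. The names `toy` and `kept` and all values are hypothetical; the sketch uses `uniqueN(year)`, which is data.table's shorthand for `length(unique(year))`.

```r
library(data.table)

# Hypothetical survey units: "A" sampled in 12 years, "B" in only 3
toy <- data.table(
  survey_unit = rep(c("A", "B"), times = c(12, 3)),
  year        = c(2000:2011, 2000:2002)
)

# Number of distinct years sampled per survey unit, stored on every row
toy[, years_sampled := uniqueN(year), by = survey_unit]

# Keep only survey units sampled in at least 10 distinct years
kept <- toy[years_sampled >= 10]
unique(kept$survey_unit)
```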
###For taxonomic analyses, resolution to species is required.
Therefore, we will exclude any observations not resolved to species.
#convert month to numeric
FishGlob.10year[,month := as.numeric(month)]
FishGlob.10year.spp <- FishGlob.10year[rank %in% c("Species", "Subspecies"),] #3869384 total observations
#remove full species database
rm(FishGlob.10year)
#vector with all survey names
all_survey_units <- sort(unique(FishGlob.10year.spp[,survey_unit]))
#calculate # species per year
FishGlob.10year.spp_survey_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, accepted_name)])
FishGlob.10year.spp_survey_year[,spp_count_survey_year := uniqueN(accepted_name),.(survey_unit, year)]
FishGlob.10year.spp_survey_year.r <-unique(FishGlob.10year.spp_survey_year[,.(survey_unit, year, spp_count_survey_year)])
nrow(FishGlob.10year.spp_survey_year.r)
[1] 1215
#calculate # hauls per year
FishGlob.10year.spp_haulid_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, haul_id)])
FishGlob.10year.spp_haulid_year[,haulid_count_survey_year := uniqueN(haul_id),.(survey_unit, year)]
FishGlob.10year.spp_haulid_year.r <-unique(FishGlob.10year.spp_haulid_year[,.(survey_unit, year, haulid_count_survey_year)])
nrow(FishGlob.10year.spp_haulid_year.r)
[1] 1215
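The per-year species and haul counts can also be produced in one grouped step. This is a sketch of an equivalent, more compact formulation on made-up records, not a replacement for the code above; `toy` and `per_year` are hypothetical names.

```r
library(data.table)

# Hypothetical catch records: one row per species per haul
toy <- data.table(
  survey_unit   = c("A", "A", "A", "A", "B"),
  year          = c(2000, 2000, 2000, 2001, 2000),
  haul_id       = c("h1", "h1", "h2", "h3", "h4"),
  accepted_name = c("Gadus morhua", "Clupea harengus", "Gadus morhua",
                    "Gadus morhua", "Sebastes mentella")
)

# Distinct species and distinct hauls per survey unit and year
per_year <- toy[, .(spp_count  = uniqueN(accepted_name),
                    haul_count = uniqueN(haul_id)),
                by = .(survey_unit, year)]
per_year
```

Both counts come back with one row per survey unit and year, which matches the two equal row counts reported above.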
##Visually Inspect Distribution of Data Through Time and Space
##Spatial and Temporal Patterns in All Trawl Surveys
Let's look at the number of hauls per year/month, year/quarter, and
year/season visually
#unique survey, survey_unit, year, month, quarter, season, haul_id, lat, lon
FishGlob.10year.uniquehauls <- unique(FishGlob.10year.spp[,.(survey, survey_unit, year,month,quarter,season,haul_id, latitude, longitude,haul_dur)])
#add column with adjusted longitude for few surveys that cross dateline (NZ-CHAT and AI)
FishGlob.10year.uniquehauls[,longitude_adj := ifelse((survey_unit %in% c("AI","NZ-CHAT") & longitude > 0),longitude-360,longitude)]
FishGlob.10year.uniquehauls[,haul_counts_per_survey_season_month :=uniqueN(haul_id),.(survey, month, season)][, #count # hauls per survey, season, and month
haul_counts_per_survey_quarter_month :=uniqueN(haul_id),.(survey, month, quarter)][,#count # hauls per survey, month, and quarter
total_hauls_survey :=uniqueN(haul_id),.(survey)][,#count # hauls per survey in all years
#proportion of hauls for each survey, season, and month divided by total # over all years
haul_proportion_survey_season :=haul_counts_per_survey_season_month/total_hauls_survey][,
#proportion of hauls for each survey, quarter, and month divided by total # over all years
haul_proportion_survey_quarter :=haul_counts_per_survey_quarter_month/total_hauls_survey][,
haul_count_per_survey_year_month :=uniqueN(haul_id),.(year, survey_unit, month)][, #count # hauls per survey unit, year, and month
total_hauls_survey_year := uniqueN(haul_id),.(survey_unit,year)][, #count total # hauls per survey unit and year
#proportion of hauls for each survey unit and month divided by total # hauls within a survey unit within a year
haul_proportion_month_yearly := haul_count_per_survey_year_month/total_hauls_survey_year][,
haul_count_per_survey_year_quarter :=uniqueN(haul_id),.(year, survey_unit, quarter)][, #count # hauls per survey unit, year, and quarter
#proportion of hauls for each survey unit and quarter divided by total # hauls within a survey unit within a year
haul_proportion_quarter_yearly := haul_count_per_survey_year_quarter/total_hauls_survey_year]
FishGlob.10year.uniquehauls.season <- unique(FishGlob.10year.uniquehauls[,.(survey, survey_unit, month, season, haul_counts_per_survey_season_month,total_hauls_survey, haul_proportion_survey_season)]) #relative sampling by season across all years
FishGlob.10year.uniquehauls.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey,survey_unit , month, quarter, haul_counts_per_survey_quarter_month,total_hauls_survey, haul_proportion_survey_quarter)]) #relative sampling by quarter across all years
FishGlob.10year.uniquehauls.annual.month <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, month, haul_count_per_survey_year_month,total_hauls_survey_year,haul_proportion_month_yearly)]) #relative sampling by month within years
FishGlob.10year.uniquehauls.annual.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, quarter, haul_count_per_survey_year_quarter,total_hauls_survey_year,haul_proportion_quarter_yearly)]) #relative sampling by quarter within years
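The dateline adjustment and the within-year haul proportions are easier to follow on a handful of made-up hauls. This is a sketch with hypothetical values; the short column names (`n_month`, `n_year`, `prop_month`) stand in for the longer ones used above.

```r
library(data.table)

# Hypothetical hauls; two AI hauls sit on either side of the dateline
toy <- data.table(
  survey_unit = c("AI", "AI", "AI", "EBS"),
  haul_id     = c("h1", "h2", "h3", "h4"),
  year        = 2010,
  month       = c(6, 6, 7, 6),
  longitude   = c(179.5, -179.2, -175, -165)
)

# Shift positive longitudes west so dateline-crossing surveys plot contiguously
toy[, longitude_adj := ifelse(survey_unit %in% c("AI", "NZ-CHAT") & longitude > 0,
                              longitude - 360, longitude)]

# Hauls per unit/year/month divided by hauls per unit/year
toy[, n_month := uniqueN(haul_id), by = .(survey_unit, year, month)]
toy[, n_year  := uniqueN(haul_id), by = .(survey_unit, year)]
toy[, prop_month := n_month / n_year]

toy[, .(survey_unit, haul_id, longitude_adj, prop_month)]
```

Within each survey unit and year, the proportions across distinct months sum to 1, which is a useful check when reading the tile plots.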
#how does #hauls vary with season and month?
survey_season_month_hauls <- ggplot(FishGlob.10year.uniquehauls.season) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
facet_wrap(~survey,scales = "free_y") +
theme_classic()
ggsave(survey_season_month_hauls, filename = "survey_season_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")
#how does #hauls vary with quarter and month?
survey_quarter_month_hauls <- ggplot(FishGlob.10year.uniquehauls.quarter) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
facet_wrap(~survey,scales = "free_y") +
theme_classic()
ggsave(survey_quarter_month_hauls, filename = "survey_quarter_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")
#how does #hauls vary with year and month?
year_survey_month_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.month) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
facet_wrap(~survey_unit,scales = "free_y") +
theme_classic()
ggsave(year_survey_month_hauls, filename = "year_survey_month_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")
#how does #hauls vary with year and quarter?
year_survey_quarter_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.quarter) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
facet_wrap(~survey_unit,scales = "free_y") +
theme_classic()
ggsave(year_survey_quarter_hauls, filename = "year_survey_quarter_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")
Now, let's look at how the location of sampling varies by month of
sampling and year of sampling
location_by_year <- ggplot(FishGlob.10year.uniquehauls) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
facet_wrap(~survey_unit, scales = "free") +
theme_classic()
ggsave(location_by_year, filename = "location_by_year.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
ggsave(location_by_year, filename = "location_by_year.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
ggsave(location_by_year, filename = "location_by_year.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
(location_by_month <- ggplot(FishGlob.10year.uniquehauls) +
geom_point(aes(x = longitude_adj, y = latitude, color = as.numeric(month)), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
facet_wrap(~survey_unit, scales = "free") +
theme_classic())
ggsave(location_by_month, filename = "location_by_month.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
ggsave(location_by_month, filename = "location_by_month.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
ggsave(location_by_month, filename = "location_by_month.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")


##Region Specific Data Processing
- Fredston et al. 2022 Nature and Batt et al. 2017 Ecology Letters informed North American data processing
- Personal communication with Aurore Maureaud and Laurene Pecuchet re: work by A. Maureaud, L. Pecuchet and R. Frelat and the supplementary material for Maureaud et al. 2019 Proceedings of the Royal Society B: Biological Sciences informed European data processing
- Additional data processing informed by data itself, and by FishGlob pdf summary documents
- Limit to max 3 months for each survey unit, representative of a "season" (exception = West Coast USA where all 4 months sampled consistently)
####"AI"
ggplot(FishGlob.10year.uniquehauls.season[survey == "AI",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey == "AI",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey == "AI",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey == "AI",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "AI",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "AI",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Most hauls in 6,7,8
- Seemingly consistent spatial distribution through time
- No dramatic changes in spp richness
ai_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "AI" & month %in% c(6:8),haul_id])
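Every survey unit in this section ends with the same kind of filter: one unit, a set of months, and sometimes a year range. A minimal sketch of a helper that wraps this pattern follows. The function `keep_hauls()` and its argument names are hypothetical and do not appear in the original script; the toy table only mimics the columns used in the filters (`survey_unit`, `month`, `year`, `haul_id`).

```r
library(data.table)

# Hypothetical helper: unique haul IDs for one survey unit, restricted to the
# chosen months and an optional (inclusive) year range
keep_hauls <- function(dt, unit_name, keep_months, min_year = -Inf, max_year = Inf) {
  unique(dt[survey_unit == unit_name &
              month %in% keep_months &
              year >= min_year & year <= max_year,
            haul_id])
}

# Toy data with made-up hauls
toy <- data.table(
  survey_unit = c("AI", "AI", "AI", "AI", "EBS", "EBS"),
  haul_id     = c("a1", "a2", "a3", "a3", "e1", "e2"),
  month       = c(5, 6, 7, 7, 6, 7),
  year        = c(1990, 1991, 1992, 1992, 1983, 1984)
)

ai_keep  <- keep_hauls(toy, "AI", 6:8)                    # months 6-8, all years
ebs_keep <- keep_hauls(toy, "EBS", 6:8, min_year = 1984)  # months 6-8, 1984 onward
```

Filters that need a different kind of condition (a single excluded year, or a longitude cutoff) would still need their own expression.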
####BITS (We have two surveys for BITS, quarter 1 and quarter 4)
BITS 1
From Fredston et al. 2023, every year after 2000 has >400 hauls and most of the earlier years have <50
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-1",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-1",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-1",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-1",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-1",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-1",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep both months (2,3)
- Seemingly consistent spatial distribution through time
- Consistent # of species and # hauls after 2000
bits1_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-1" & month %in% c(2,3) & year > 2000,haul_id])
BITS 4
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-4",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-4",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-4",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-4",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-4",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-4",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep (10,11,12)
- Use years after 2000 (survey starts in 1996, but there is a gap in 1997 and 1998, and 1996 was sampled entirely in December; spp richness in the first survey is also very low; consistent # of hauls after 2000)
- Seemingly consistent spatial distribution through time
bits4_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-4" & month %in% c(10:12) & year > 2000,haul_id])
####CHL (Chile)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "CHL",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "CHL",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "CHL",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "CHL",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "CHL",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "CHL",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep (7,8,9)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time
chl_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "CHL" & month %in% c(7:9),haul_id])
####DFO-NF
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-NF",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-NF",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-NF",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-NF",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-NF",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-NF",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep (10,11,12)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time
dfo_nf_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF" & month %in% c(10:12),haul_id])
####DFO-QCS
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-QCS",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-QCS",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-QCS",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-QCS",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep (7,8)
- Seemingly consistent spatial distribution through time
- No major changes in richness over time
- No major changes in # hauls
dfo_qcs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS" & month %in% c(7,8),haul_id])
####EBS
- Sampling years prior to 1984 (data begin in 1982) were excluded from analysis due to large apparent increases in the number of species recorded in the first two years (Batt et al. 2017)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EBS",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EBS",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EBS",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EBS",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EBS",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EBS",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep (6,7,8)
- Seemingly consistent spatial distribution through time
- Per Batt et al. 2017, limit to >= 1984
- No clear changes in richness through time
- No clear changes in # hauls through time
ebs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EBS" & month %in% c(6,7,8) & year >= 1984,haul_id])
####EVHOE
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EVHOE",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EVHOE",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EVHOE",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EVHOE",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EVHOE",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EVHOE",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep (10,11,12)
- Seemingly consistent spatial distribution through time
- Very low sampling in 2017 (and also low richness), so exclude this year
evhoe_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EVHOE" & month %in% c(10,11,12) & year != 2017 ,haul_id])
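The decision to drop 2017 was made by eye from the hauls-per-year bar plot. As a rough cross-check, years with unusually low effort could be flagged by comparing each year's haul count with the median. The sketch below uses made-up counts, and the threshold `low_fraction` is a hypothetical choice, not a value used in this analysis.

```r
library(data.table)

# Toy hauls-per-year table (made-up numbers; same column name as in the plots above)
toy_counts <- data.table(
  year = 2012:2018,
  haulid_count_survey_year = c(150, 148, 155, 152, 149, 30, 151)
)

# Hypothetical rule: flag a year if it has fewer than half the median number of hauls
low_fraction <- 0.5
median_hauls <- median(toy_counts$haulid_count_survey_year)
toy_counts[, low_effort := haulid_count_survey_year < low_fraction * median_hauls]

flagged_years <- toy_counts[low_effort == TRUE, year]
print(flagged_years)
```

A flag like this only points at candidate years; whether to exclude them still depends on the richness and spatial plots.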
####FALK (excluded from final dataset)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FALK",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FALK",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FALK",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FALK",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FALK",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FALK",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep February (2) only, from 2004 onward (most consistent sampling)
- Inconsistent spatial distribution through time, but this will be fixed in the next step with spatial standardization
falk_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FALK" & month %in% c(2) & year >= 2004, haul_id])
####FR-CGFS
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FR-CGFS",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FR-CGFS",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FR-CGFS",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FR-CGFS",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep 9,10,11
- Consistent spatial distribution through time
- Seemingly consistent richness through time
- Seemingly consistent # hauls through time
fr_cgfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS" & month %in% c(9,10,11), haul_id])
####GIN (excluded from final dataset)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GIN",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GIN",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GIN",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GIN",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GIN",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GIN",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Exclude this region; no consistent sampling through time
gin_hauls_keep <- NULL
####GMEX
- In the Gulf of Mexico, we restricted our analysis to data from 1984 - 2000 (full range 1982-2014); if all years had been used, the number of sites sampled in at least 85% of years would drop from 39 to 13 (Batt et al. 2017)
GMEX Fall
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Fall",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Fall",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Fall",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Fall",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep 9,10,11
- Inconsistent spatial distribution through time; will restrict to longitude < -87.5
- Seemingly consistent richness through time
- Seemingly consistent # hauls through time
gmex_fall_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall" & month %in% c(9,10,11) & longitude_adj < -87.5, haul_id])
GMEX Summer
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Summer",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Summer",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Summer",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Summer",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep months 5,6,7
- Inconsistent spatial distribution through time, but this will be fixed in the spatial standardization step
- Seemingly consistent richness within each period (before 2008, and 2008 onward)
- Seemingly consistent # hauls through time
- Jump from 2007 to 2008, when the spatial footprint increases, so I will only use data from before 2008
gmex_summer_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer" & month %in% c(5,6,7) & year <2008, haul_id])
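The 2007 to 2008 jump in spatial footprint was identified from the haul maps. One way to put numbers on such a change is to summarise the longitude and latitude span of hauls in each year. The sketch below does this on made-up coordinates; `toy_gmex` and the summary column names are hypothetical, and the span of a bounding box is only a crude proxy for the footprint.

```r
library(data.table)

# Toy haul positions (made-up): a narrow footprint in 2006-2007, a wider one from 2008
toy_gmex <- data.table(
  year          = rep(2006:2009, each = 3),
  longitude_adj = c(-95, -94, -93,  -95, -94, -93,  -97, -90, -85,  -97, -90, -84),
  latitude      = c(28, 28.5, 29,   28, 28.5, 29,   26, 28, 30,     26, 28, 30)
)

# Per-year extent: width of the longitude and latitude ranges, plus haul count
extent_by_year <- toy_gmex[, .(
  lon_span = diff(range(longitude_adj)),
  lat_span = diff(range(latitude)),
  n_hauls  = .N
), by = year]

print(extent_by_year)
```

A sudden increase in `lon_span` or `lat_span` between consecutive years would correspond to the kind of footprint change described in the notes above.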
####GOA
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GOA",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GOA",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GOA",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GOA",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GOA",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GOA",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Keep months 6,7,8
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent # hauls through time
goa_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GOA" & month %in% c(6,7,8), haul_id])
####GRL-DE
- From Beukhof et al. 2019, all surveys in October and November
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GRL-DE",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GRL-DE",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GRL-DE",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GRL-DE",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GRL-DE",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GRL-DE",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-No months in data set, but according to Beukhof et al. 2019, all sampling was in October and November, so keep all
-Consistent spatial distribution through time
-Seemingly consistent richness
-Number of hauls drops between 1991 and 1992, and both 1992 and 2017 are low, so limit to the years in between (1993-2016)
grl_de_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE" & year %in% c(1993:2016), haul_id])
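The year window above was chosen by eye from the hauls-per-year plot. As an illustration only, a sketch of one way to flag low-effort years numerically; the counts are made up and the half-of-median cutoff is an arbitrary assumption, not the rule used in this script.

```r
library(data.table)

# Hypothetical haul counts per year (not real GRL-DE values)
effort <- data.table(
  year = 1990:1996,
  n_hauls = c(120, 118, 40, 95, 100, 98, 102)
)

# Flag years whose haul count is below half the median count
# (the 0.5 multiplier is an arbitrary illustrative cutoff)
effort[, low_effort := n_hauls < 0.5 * median(n_hauls)]

# Years that would be candidates for exclusion
low_years <- effort[low_effort == TRUE, year]
```

A flag like this only points at candidate years; the decision to cut the series still rests on the plots.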
####GSL
GSL-N
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-N",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-N",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-N",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-N",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-N",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-N",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep 6,7,8
-Consistent spatial distribution through time
-Seemingly consistent richness
-Number of hauls in 2005 is higher, so start in 2006
gsl_n_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-N" & year > 2005, haul_id])
GSL-S
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-S",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-S",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-S",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-S",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-S",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-S",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep 8,9,10
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls
gsl_s_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-S" & month %in% c(8:10), haul_id])
####ICE-GFS
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ICE-GFS",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ICE-GFS",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ICE-GFS",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ICE-GFS",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep 2,3,4
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls
ice_gfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS" & month %in% c(2:4), haul_id])
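The same diagnostic plots are repeated for every survey unit. A sketch of how the two haul maps could be wrapped in a helper; `plot_haul_maps` and the toy table are hypothetical, and `scale_color_viridis_c()` from ggplot2 stands in for `scale_color_viridis()`.

```r
library(data.table)
library(ggplot2)

# Hypothetical helper: builds the haul map colored by year and by month
# for one survey unit and returns both plots in a list
plot_haul_maps <- function(hauls, unit) {
  unit_hauls <- hauls[survey_unit == unit]
  lapply(c("year", "month"), function(v) {
    ggplot(unit_hauls) +
      geom_point(aes(x = longitude_adj, y = latitude, color = .data[[v]]),
                 size = 0.3, alpha = 0.5) +
      scale_color_viridis_c() +
      theme_classic()
  })
}

# Toy haul table (hypothetical coordinates and dates)
toy_hauls <- data.table(
  survey_unit = c("ICE-GFS", "ICE-GFS", "GOA"),
  longitude_adj = c(-20.1, -22.4, -150.0),
  latitude = c(64.2, 66.0, 57.5),
  year = c(2001, 2002, 2001),
  month = c(3, 3, 7)
)

maps <- plot_haul_maps(toy_hauls, "ICE-GFS")
```

Printing `maps[[1]]` or `maps[[2]]` draws the corresponding map; the tile and bar plots could be added to the same list in the same way.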
####IE-IGFS
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IE-IGFS",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IE-IGFS",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IE-IGFS",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IE-IGFS",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep 10,11,12
-Consistent spatial distribution through time after 2004 (sampled far east in 2003 and 2004)
-Seemingly consistent richness
-Seemingly consistent number of hauls
ie_igfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS" & month %in% c(10:12) & year > 2004, haul_id])
####IS-MOAG (excluded from final dataset)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IS-MOAG",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IS-MOAG",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IS-MOAG",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IS-MOAG",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Sampling too scattered over time, excluding
is_moag_hauls_keep <- NULL
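Setting a keep vector to NULL is a convenient way to drop a survey: assuming the `*_hauls_keep` vectors are later concatenated with `c()` (that step is not shown in this section), a NULL entry contributes no haul IDs. The vector names below are hypothetical.

```r
# Hypothetical keep vectors; the real ones hold FishGlob haul IDs
a_hauls_keep <- c("h1", "h2")
excluded_hauls_keep <- NULL  # survey dropped entirely
b_hauls_keep <- c("h3")

# NULL is silently skipped when vectors are concatenated
all_keep <- c(a_hauls_keep, excluded_hauls_keep, b_hauls_keep)
length(all_keep)
```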
####MEDITS
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MEDITS",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MEDITS",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MEDITS",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MEDITS",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MEDITS",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MEDITS",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep all surveys in quarter 2
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls
medits_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "MEDITS", haul_id])
####MRT (excluded from final dataset)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MRT",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MRT",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MRT",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MRT",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MRT",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MRT",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Sampling inconsistent, exclude completely
mrt_hauls_keep <- NULL
####NAM
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NAM",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NAM",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NAM",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NAM",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NAM",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NAM",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep surveys in months 1 and 2 (most consistently sampled)
-Consistent spatial distribution through time
-Seemingly consistent richness except for 1998 (exclude)
-Seemingly consistent number of hauls except for 1998 (exclude)
nam_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NAM" & month %in% c(1,2) & year != 1998, haul_id])
####NEUS
NEUS Spring
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Spring",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Spring",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Spring",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Spring",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep months 3,4,5
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness (especially after 1987; should be fixed with standardization step)
-Seemingly consistent number of hauls (especially after 1981; should be fixed with standardization step)
# Calculate wgt_cpue (km^2 average from Sean Lucey) and wgt_h (all biomass values calibrated to the standard pre-2009 30-minute tow)
FishGlob.10year.spp[survey == "NEUS", wgt_h := wgt/0.5][survey == "NEUS", wgt_cpue := wgt/0.0384][survey == "NEUS", num_h := num/0.5][survey == "NEUS", num_cpue := num/0.0384]
# Also, for the Northeast, drop hauls before 2009 that fall outside 30 +/- 5 minutes, and hauls from 2009 onward that fall outside 20 +/- 5 minutes (thresholds below are in hours)
neus_spring_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Spring" & month %in% c(3:5) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
(survey_unit == "NEUS-Spring" & month %in% c(3:5) & year >= 2009 & (haul_dur > 0.25 & haul_dur < 0.42))), haul_id])
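The hour thresholds in the filter above follow from converting the nominal tow length plus or minus 5 minutes into hours, assuming `haul_dur` is stored in hours. `dur_window` is a hypothetical helper for the arithmetic.

```r
# Convert a nominal tow length and tolerance (both in minutes)
# into a lower/upper window expressed in hours
dur_window <- function(nominal_min, tol_min = 5) {
  c(lower = (nominal_min - tol_min) / 60,
    upper = (nominal_min + tol_min) / 60)
}

round(dur_window(30), 2)  # lower 0.42, upper 0.58 (pre-2009 30-minute tows)
round(dur_window(20), 2)  # lower 0.25, upper 0.42 (20-minute tows from 2009 on)
```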
NEUS Fall
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Fall",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Fall",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Fall",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Fall",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep months 9,10,11
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness (especially after 1984; should be fixed with standardization step)
-Seemingly consistent number of hauls (especially after 1985; should be fixed with standardization step)
# Also, for the Northeast, drop hauls before 2009 that fall outside 30 +/- 5 minutes, and hauls from 2009 onward that fall outside 20 +/- 5 minutes (thresholds below are in hours)
neus_fall_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
(survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year >= 2009 & (haul_dur > 0.25 & haul_dur < 0.42))), haul_id])
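NEUS-Spring and NEUS-Fall apply the same two-era duration rule, differing only in survey unit and months. A sketch of a shared helper on toy data; `neus_keep_ids` and the `toy` table are hypothetical, and the condition is a factored form of the filter above.

```r
library(data.table)

# Hypothetical helper: applies the pre-2009 (30 min) and 2009+ (20 min)
# duration windows, in hours, to one survey unit and set of months
neus_keep_ids <- function(dt, unit, months) {
  unique(dt[survey_unit == unit & month %in% months &
              ((year < 2009 & haul_dur > 0.42 & haul_dur < 0.58) |
               (year >= 2009 & haul_dur > 0.25 & haul_dur < 0.42)),
            haul_id])
}

# Toy hauls (hypothetical): two per era, one inside and one outside the window
toy <- data.table(
  survey_unit = "NEUS-Fall",
  haul_id = c("a", "b", "c", "d"),
  year = c(2000, 2000, 2012, 2012),
  month = c(9, 10, 10, 11),
  haul_dur = c(0.50, 0.70, 0.33, 0.50)
)

neus_keep_ids(toy, "NEUS-Fall", 9:11)  # hauls "a" and "c" pass
```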
####NIGFS Northern Ireland
Spring Northern Ireland (quarter 1)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-1",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-1",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-1",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-1",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep months 2,3,4
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness
-Seemingly consistent number of hauls
nigfs_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1" & month %in% c(2,3,4), haul_id])
Fall Northern Ireland (quarter 4)
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-4",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-4",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-4",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-4",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

-Keep months 10,11
-Consistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness
-Seemingly consistent number of hauls
nigfs_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4" & month %in% c(10,11), haul_id])
####Nor-BTS
The original FishGlob also includes Nor-BTS-1, but it was not shared by L.
Pecuchet and is therefore ignored
Nor-BTS-3
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "Nor-BTS-3",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "Nor-BTS-3",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "Nor-BTS-3",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "Nor-BTS-3",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# Number of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 8, 9, 10
- Somewhat consistent spatial distribution through time
- Number of hauls is variable, but no clear years to exclude
- Laurene Pecuchet (U Tromso) told us that only surveys from 2004 onwards work for biodiversity analyses
nor_bts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3" & month %in% c(8:10) & year >= 2004, haul_id])
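The same filter pattern (survey unit, months, optional start year, optional excluded years) recurs for every survey in this section. As a minimal sketch only, it could be wrapped in a helper. The function `keep_hauls()`, its argument names, and the toy table below are hypothetical and are not part of the original script; only the column names (`survey_unit`, `month`, `year`, `haul_id`) come from the code above.

```r
library(data.table)

# Hypothetical helper (an assumption, not in the original script): returns the
# unique haul IDs of one survey unit within the chosen months and years.
keep_hauls <- function(hauls, unit_name, keep_months,
                       min_year = -Inf, drop_years = integer(0)) {
  unique(hauls[survey_unit == unit_name &
                 month %in% keep_months &
                 year >= min_year &
                 !(year %in% drop_years), haul_id])
}

# Toy data standing in for FishGlob.10year.uniquehauls (values are invented)
toy_hauls <- data.table(
  survey_unit = c("Nor-BTS-3", "Nor-BTS-3", "Nor-BTS-3", "Nor-BTS-3", "NS-IBTS-1"),
  haul_id     = c("h1", "h2", "h3", "h4", "h5"),
  year        = c(2003L, 2004L, 2010L, 2010L, 2010L),
  month       = c(9L, 8L, 11L, 10L, 2L)
)

# Same rule as nor_bts_3_keep: months 8 to 10, years from 2004 onwards
nor_bts_3_toy_keep <- keep_hauls(toy_hauls, "Nor-BTS-3",
                                 keep_months = 8:10, min_year = 2004)
print(nor_bts_3_toy_keep)
```

On the toy table, the haul from 2003 and the haul from month 11 are dropped, as is the haul from the other survey unit.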
####NS-IBTS
NS-IBTS-1
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-1",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-1",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-1",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-1",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 1, 2, 3
- Consistent spatial distribution through time
- Linear increase in richness; the cutoff is clearer in # hauls
- Linear increase in # hauls, but with a somewhat clear break between the late 1970s and mid-1980s; only keep hauls from 1984 onwards
ns_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1" & month %in% c(1:3) & year >= 1984, haul_id])
NS-IBTS-3
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-3",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-3",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-3",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-3",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 7, 8, 9
- Consistent spatial distribution through time
- Consistent richness through time
- Early years have lower # hauls; will start at 1998
ns_ibts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3" & month %in% c(7:9) & year >= 1998, haul_id])
####NZ
NZ-CHAT
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-CHAT",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-CHAT",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-CHAT",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-CHAT",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 12, 1, 2 (note that this NZ-CHAT survey crosses the calendar year, so we already lumped month 12 with the next year)
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls after 1995
nz_chat_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT" & month %in% c(12,1,2) & year >= 1995, haul_id])
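The note above says December hauls were already lumped with the following year earlier in the script; that step is not shown in this section. Purely as an illustration of why it matters for the `year >= 1995` cutoff, the sketch below applies an assumed version of that rule to invented data. The column name `survey_year` is hypothetical.

```r
library(data.table)

# Toy hauls with raw calendar year and month (values are invented)
toy_hauls <- data.table(
  haul_id = c("a", "b", "c", "d"),
  year    = c(1994L, 1995L, 1995L, 1996L),
  month   = c(12L, 1L, 12L, 2L)
)

# Assumed rule for illustration: a December haul belongs to the survey season
# of the following year, so its survey year is the calendar year plus one.
toy_hauls[, survey_year := ifelse(month == 12L, year + 1L, year)]

# Filtering on the raw calendar year would drop haul "a" (December 1994),
# while filtering on the adjusted survey year keeps it in the 1995 season.
kept_calendar <- toy_hauls[year >= 1995L, haul_id]
kept_survey   <- toy_hauls[survey_year >= 1995L, haul_id]
print(toy_hauls)
```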
NZ-ECSI
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-ECSI",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-ECSI",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-ECSI",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-ECSI",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 4, 5, 6
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls
- Gap between 1995 and 2005, but we have 10 total years so we'll keep for now
nz_ecsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI" & month %in% c(4,5,6), haul_id])
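The decision to keep this survey rests on having at least 10 sampled years despite the gap. A minimal sketch of how that could be checked after filtering is shown below; the toy table is invented and only mimics the shape of `FishGlob.10year.uniquehauls`.

```r
library(data.table)

# Toy hauls: sampling in 1991-1995, then 2005-2012 (values are invented)
toy_hauls <- data.table(
  haul_id = paste0("h", 1:14),
  year    = c(1991:1995, 2005:2012, 2012L),
  month   = c(rep(5L, 13), 8L)
)

# Apply the month filter, then count distinct years that remain
kept_years <- toy_hauls[month %in% 4:6, sort(unique(year))]
n_years    <- length(kept_years)

# Longest run of calendar years between consecutive sampled years
longest_gap <- max(diff(kept_years))

print(n_years)
print(longest_gap)
```

Here `n_years >= 10` is the retention criterion described in the note, and `longest_gap` makes the size of the break explicit.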
NZ-SUBA
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-SUBA",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-SUBA",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-SUBA",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-SUBA",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 11 and 12
- Consistent spatial distribution through time
- Seemingly consistent richness
- Far more hauls in the 1990s; these early sampling years will be excluded (start in 2000)
nz_suba_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA" & month %in% c(11,12) & year >= 2000, haul_id])
NZ-WCSI
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-WCSI",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-WCSI",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-WCSI",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-WCSI",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 3, 4
- Consistent spatial distribution through time
- Seemingly consistent richness
- Linear decrease in # of hauls through time; leave out the first two years, which have the highest # hauls (keep year >= 1995)
nz_wcsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI" & month %in% c(3,4) & year >= 1995, haul_id])
####PT-IBTS
PT-IBTS
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "PT-IBTS",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "PT-IBTS",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "PT-IBTS",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "PT-IBTS",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 9, 10, 11
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls
pt_ibts_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS" & month %in% c(9,10,11), haul_id])
####ROCKALL
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ROCKALL",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ROCKALL",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ROCKALL",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ROCKALL",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ROCKALL",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ROCKALL",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 8, 9
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls
rockall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL" & month %in% c(8,9), haul_id])
####S-GEORG
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "S-GEORG",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "S-GEORG",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "S-GEORG",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "S-GEORG",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "S-GEORG",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "S-GEORG",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 1 and 2
- Consistent spatial distribution through time
- Seemingly consistent richness except for 2003, which will be excluded
- Seemingly consistent number of hauls except for 2012, which will be excluded
s_georg_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG" & month %in% c(1,2) & !(year %in% c(2003,2012)), haul_id])
####SCS
Spring
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SPRING",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SPRING",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SPRING",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SPRING",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 2, 3, 4
- Inconsistent spatial distribution through time (northern latitudes only sampled in early years); only include longitudes < -62 and latitudes < 45.5
- Seemingly consistent richness
- Number of hauls is variable; exclude years with very low or very high numbers (1985, 1994, 2015, 2019)
scs_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING" & month %in% c(2,3,4) & !(year %in% c(1985,1994,2015,2019)) & longitude_adj < -62 & latitude < 45.5, haul_id])
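The spatial restriction above was read off the maps. As a rough numerical complement, the sketch below tabulates the northernmost and easternmost haul per year and then applies the same bounds. The coordinates are invented for illustration; only the column names and the thresholds (-62 and 45.5) come from the code above.

```r
library(data.table)

# Toy hauls: early years reach further north and east (values are invented)
toy_hauls <- data.table(
  haul_id       = paste0("s", 1:6),
  year          = c(1980L, 1980L, 1981L, 2000L, 2000L, 2001L),
  latitude      = c(43.1, 46.8, 47.2, 42.9, 44.7, 45.1),
  longitude_adj = c(-65.0, -60.5, -59.8, -66.2, -63.4, -64.0)
)

# Per-year extent: a shrinking maximum suggests the footprint changed over time
extent_by_year <- toy_hauls[, .(max_latitude  = max(latitude),
                                max_longitude = max(longitude_adj)),
                            by = year]

# Hauls retained by the bounding rule used for SCS-SPRING
in_bounds <- toy_hauls[longitude_adj < -62 & latitude < 45.5, haul_id]

print(extent_by_year)
print(in_bounds)
```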
Summer
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SUMMER",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SUMMER",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SUMMER",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SUMMER",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 6, 7, 8
- Consistent spatial distribution through time
- Richness increases linearly with no clear breakpoint; using the breakpoint from # of hauls, but will exclude 2010, which has very high richness
- # Hauls increases linearly from ~120 in 1970 to ~220 in 2020 with no clear breakpoint, but will go with 1986 because there is a jump between 1985 and 1986
- Gear change in 1983 (Ellingsen et al. 2015)
scs_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER" & month %in% c(6,7,8) & year >= 1986 & year != 2010, haul_id])
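The 1986 start year was chosen by eye from the haul-count bar plot. One hypothetical way to make such a jump visible in numbers is to compute the year-over-year change in haul counts, as sketched below. The counts are invented and the column names `haul_count` and `change` are assumptions; this is not how the original cutoff was derived.

```r
library(data.table)

# Toy yearly haul counts with a step between 1985 and 1986 (values are invented)
toy_counts <- data.table(
  year       = 1983:1988,
  haul_count = c(130L, 128L, 133L, 160L, 162L, 165L)
)

# Change relative to the previous year; the first year has no predecessor (NA)
toy_counts[, change := haul_count - shift(haul_count)]

# Year with the largest increase over the previous year
largest_jump_year <- toy_counts[which.max(change), year]

print(toy_counts)
print(largest_jump_year)
```

A check like this only flags candidate breakpoints; the final choice still depends on the plots and on known survey changes such as the 1983 gear change noted above.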
###SEUS
Spring
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-spring",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-spring",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-spring",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-spring",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 4, 5, 6
- Consistent spatial distribution through time
- Consistent richness through time
- # Hauls low in 1989 and 2018; will exclude
seus_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring" & month %in% c(4,5,6) & year != 1989 & year != 2018, haul_id])
Summer
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-summer",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-summer",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-summer",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-summer",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 7, 8
- Consistent spatial distribution through time
- Richness consistent through time
- # Hauls low in first year, otherwise okay; just exclude first year (1989)
seus_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer" & month %in% c(7,8) & year != 1989, haul_id])
Fall
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-fall",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-fall",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-fall",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-fall",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 9,10,11
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls low in first year, otherwise okay, so exclude only the first year (1989)
seus_fall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall" & month %in% c(9,10,11) & year != 1989, haul_id])
####SWC-IBTS
Scotland Shelf Sea
SWC-IBTS 1
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-1",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-1",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-1",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-1",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 1,2,3
- Somewhat inconsistent spatial distribution through time, but this should be addressed in the spatial standardization procedure
- Richness consistent through time
- Number of hauls consistent except low in 1995, so exclude 1995
swc_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1" & month %in% c(1,2,3) & year != 1995, haul_id])
SWC-IBTS 4
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-4",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-4",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-4",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-4",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Use months 10,11,12
- Somewhat inconsistent spatial distribution through time (southern latitudes only sampled in early years), but this should be addressed in the spatial standardization procedure
- Richness consistent through time (especially after the mid 90s)
- Number of hauls consistent except low before 1995 and low in 2013, so exclude these
swc_ibts_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4" & month %in% c(10,11,12) & year != 2013 & year >= 1995, haul_id])
####WCANN
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "WCANN",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "WCANN",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "WCANN",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "WCANN",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "WCANN",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "WCANN",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- One exception here: use four months (6,7,8,9), because all are sampled consistently and lower-latitude areas are consistently sampled later in the summer
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls consistent through time
wcann_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "WCANN" & month %in% c(6:9), haul_id])
####WCTRI
- Exclude because only 10 years and overlaps somewhat with WCANN
wctri_keep <- NULL
####ZAF
ATL
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-ATL",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-ATL",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-ATL",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-ATL",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Include months 1,2,3
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls consistent through time after 1991
zaf_atl_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL" & month %in% c(1:3) & year >= 1991, haul_id])
IND
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-IND",]) +
geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-IND",]) +
geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
scale_fill_viridis() +
labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-IND",]) +
geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-IND",]) +
geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
scale_fill_viridis() +
labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
scale_color_viridis() +
theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
scale_color_viridis(option = "plasma") +
theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=spp_count_survey_year)) +
geom_col() +
theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=haulid_count_survey_year)) +
geom_col() +
theme_classic()

- Include months 4,5,6
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls consistent before 2001, and then also in 2005 and 2009-2010
zaf_ind_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND" & month %in% c(4:6) & year %in% c(1985:2001,2005, 2009,2010), haul_id])
####Combine all lists that have _keep
#all objects with _keep
list_obj <- ls(pattern = "_keep")
#combine
fishglob_haulids_to_keep <- unlist(lapply(list_obj, get)) #229894 hauls (Started with 278405)
FishGlob.10year.spp_manualclean <- FishGlob.10year.spp[haul_id %in% fishglob_haulids_to_keep,]
#Require latitude and longitude for all observations
FishGlob.10year.spp_manualclean <- FishGlob.10year.spp_manualclean[complete.cases(FishGlob.10year.spp_manualclean[,.(latitude, longitude)])] #check that this works
#another check for # years sampled
#new row for total number of years sampled
FishGlob.10year.spp_manualclean[,years_sampled := length(unique(year)),.(survey_unit)]
View(unique(FishGlob.10year.spp_manualclean[,.(survey_unit, years_sampled)]))
#save
saveRDS(FishGlob.10year.spp_manualclean, file = here::here("data","cleaned","FishGlob.10year.spp_manualclean.rds"))
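
One caveat about the combine step: `ls(pattern = "_keep")` matches any object whose name contains `_keep`, which includes `fishglob_haulids_to_keep` itself once it exists (for example, when the chunk is rerun in the same session). The sketch below uses made-up object names in a throwaway environment (`aaa_keep`, `bbb_keep`, `ccc_keep`, and `ids_to_keep` are all hypothetical) to show one way to guard against that. It is an illustration under those assumptions, not the method used for the manuscript.

```r
# Hypothetical keep vectors standing in for the per-survey *_keep objects
env <- new.env()
assign("aaa_keep", c("h1", "h2"), envir = env)
assign("bbb_keep", NULL, envir = env)         # excluded survey, like wctri_keep
assign("ccc_keep", c("h3"), envir = env)
assign("ids_to_keep", c("old"), envir = env)  # a previously combined result

# "_keep$" anchors the match at the end of the name; it still matches
# "ids_to_keep", so that object is dropped explicitly before combining
list_obj <- setdiff(ls(envir = env, pattern = "_keep$"), "ids_to_keep")

# mget() fetches all objects at once; unlist() silently drops the NULL entry
combined <- unique(unlist(mget(list_obj, envir = env), use.names = FALSE))
combined
```

Because `NULL` entries vanish in `unlist()`, excluded surveys contribute nothing to the combined vector.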
####Some surveys sample through end of year, fix these
- NOTE THAT THE NZ-CHAT SURVEY CROSSES THE YEAR BOUNDARY, SO LUMP MONTHS 1 AND 2 WITH THE PREVIOUS YEAR
---
title: "Prepare FishGlob Dataset"
output: html_notebook
author: Zoë J. Kitchel
date: October 11, 2023
---

Script 1 for Kitchel et al. 2023 in prep taxonomic diversity manuscript.


```{r setup}
library(tidyverse)
library(sp)
library(raster)
#library(rgeos)
library(rgbif)
library(viridis)
library(gridExtra)
library(rasterVis)
library(concaveman)
library(sf)
library(cowplot)
library(data.table)
set.seed(1)

```

Pull in compiled and cleaned data from FishGlob downloaded on November 28 2022 (V 1.5). This is typically compiled by Dr. Aurore Maureaud. This includes public and private data and therefore link cannot be shared. However with editing you can run analyses for public trawl surveys.

|Survey code|Survey name short|Survey name long|Agency|Region|Access|Provider/link to access|Inclusion|
|-----------|-----------|----------|-----------|-----------|----------|----------|----------|
|AI  |Aleutian Islands|Aleutian Islands|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|BITS-1  |Baltic Sea Q1|Baltic Sea Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|BITS-4  |Baltic Sea Q4|Baltic Sea Quarter 4|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|CHL  |Chile|Chile|Universidad de Concepción, Chile|South America|Requires data request|Daniela Yepson daniela.yepsen@gmail.com and Luis Cubillos lucubillos@gmail.com|Included|
|COL| Colombia| Colombian Caribbean|Universidad Nacional de Colombia|South America|Requires data request|Camilo B. Garcia cbgarciar@unal.edu.co|Too few years|
|DFO-HS  |Hecate Strait|Hecate Strait|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/780a1c02-1f9c-4994-bc70-a0e9ef8e3968 and OceanAdapt: https://zenodo.org/records/8103080|Too few years|
|DFO-NF  |Newfoundland|Newfoundland|Department of Fisheries and Oceans| Canada|Requires data request|Mariano Koen-Alonso mariano.koen-alonso@dfo-mpo.gc.ca|Included|
|DFO-QCS  |Queen Charlotte Sound|Queen Charlotte Sound|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/a278d1af-d567-4964-a109-ae1e84cbd24a and OceanAdapt: https://zenodo.org/records/8103080|Included|
|DFO-SOG  |Strait of Georgia|Strait of Georgia|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/d880ba18-8790-41a2-bf73-e9247380759b and OceanAdapt: https://zenodo.org/records/8103080| Too few years|
|DFO-WCHG  |West Coast Haida Gwaii|West Coast Haida Gwaii|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/5ee30758-b1d6-49fe-8c4e-5136f4b39ad1 and OceanAdapt: https://zenodo.org/records/8103080| Too few years|
|DFO-WCVI  |West Coast Vancouver Island|West Coast Vancouver Island|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/557e42ae-06fe-426d-8242-c3107670b1de and OceanAdapt: https://zenodo.org/records/8103080| Too few years|
|EBS  |Eastern Bering Sea|Eastern Bering Sea|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|EVHOE  |Bay of Biscay|Bay of Biscay|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|FALK  |Falkland Islands|Falkland Islands|Falkland Islands Fisheries Department|Southern Ocean|Requires data request|Alexander Arkhipkin aarkhipkin@fisheries.gov.fk and Jorge Ramos jeramos@fisheries.gov.fk| Excluded after spatial temporal standardization in next script|
|FR-CGFS  |English Channel|English Channel|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|GIN  |Guinea|Guinea|National Center of Fisheries Sciences of Boussoura, Conakry, Republic of Guinea|Africa|Requires data request|Mohammed Lamine Camara mlcamara.kennedy@gmail.com|Inconsistent sampling through space and time|
|GMEX-Summer  |Gulf of Mexico Summer|Gulf of Mexico Summer|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|GMEX-Fall  |Gulf of Mexico Fall|Gulf of Mexico Fall|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|GOA  |Gulf of Alaska|Gulf of Alaska|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|GRL-DE  |Greenland|Greenland|Thuenen Institute of Sea Fisheries|Europe|Requires data request|Karl-Michael Werner karl-michael.werner@thuenen.de|Included|
|GSL-N  |N Gulf of St. Lawrence|Northern Gulf of St. Lawrence|Department of Fisheries and Oceans|Canada|Public|See OceanAdapt: https://zenodo.org/records/8103080 for specific DFO links|Included|
|GSL-S  |S Gulf of St. Lawrence|Southern Gulf of St. Lawrence|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/1989de32-bc5d-c696-879c-54d422438e64 and OceanAdapt: https://zenodo.org/records/8103080|Included|
|ICE-GFS  |Iceland|Iceland|Marine and Freshwater Research Institute, Iceland|Europe|Requires data request|Jón Sólmundsson jon.solmundsson@hafogvatn.is|Included|
|IE-IGFS  |Irish Sea|Irish Sea|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|IS-TAU| Israel | Israel| Tel Aviv University|Asia|Requires data request| Jonathan Belmaker jonathan.belmaker@gmail.com|Too few years|
|IS-MOAG|Israel|Israel|Israeli Ministry of Agriculture|Asia|Requires data request|Oren Sonin orens@moag.gov.il and Dori Edelist blackreefs@gmail.com|Inconsistent sampling through space and time|
|MEDITS  |Mediterranean|Mediterranean|Multiple|Europe|Requires data request|Contact corresponding author for contacts|Included|
|MRT|Mauritania|Mauritania|Institut Mauritanien de Recherches Océanographiques et des Pêches, Nouadhibou, Mauritania|Africa|Requires data request|Beyah Meissa bmouldhabib@gmail.com|Inconsistent sampling through space and time|
|NAM  |Namibia|Namibia|National Marine Information and Research Centre, Ministry of Fisheries and Marine Resources, Namibia|Africa|Requires data request|Johannes Kathena john.kathena@mfmr.gov.na|Included|
|NEUS-Fall  |NE US Fall|Northeast USA Fall|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|NEUS-Spring  |NE US Spring|Northeast USA Spring|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|NIGFS-1  |N Ireland Q1|North Ireland Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NIGFS-4  |N Ireland Q4|North Ireland Quarter 4|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|Nor-BTS-3  |Barents Sea Norway Q3|Barents Sea Norway Q3|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NS-IBTS-1  |N Sea Q1|North Sea Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NS-IBTS-3  |N Sea Q3|North Sea Quarter 3|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NZ-CHAT  |Chatham Rise NZ|Chatham Rise New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|NZ-ECSI  |E Coast S Island NZ|East Coast South Island New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|NZ-SUBA  |Sub-Antarctic NZ|Sub-Antarctic New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|NZ-WCSI  |W Coast S Island NZ|West Coast South Island New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|PT-IBTS  |Portugal|Portugal|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|ROCKALL  |Rockall Plateau|Rockall Plateau|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|S-GEORG  |S Georgia|South Georgia|British Antarctic Survey|Southern Ocean|Requires data request|Mark Belchier mark.belchier@gov.gs and Martin Collins macol@bas.ac.uk|Included|
|SCS-Fall  |Scotian Shelf Fall|Scotian Shelf Fall|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080|Too few years|Included|
|SCS-SPRING  |Scotian Shelf Spring|Scotian Shelf Spring|Department of Fisheries and Oceans| Canada|Public|https://open.canada.ca/data/en/dataset/fecf045a-95a2-4b69-8a40-818649a62716 and OceanAdapt: https://zenodo.org/records/8103080|Too much data loss after spatial temporal standardization|
|SCS-SUMMER  |Scotian Shelf Summer|Scotian Shelf Summer|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SEUS-fall  |SE US Fall|Southeast USA Fall|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SEUS-spring  |SE US Spring|Southeast USA Spring|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SEUS-summer  |SE US Summer|Southeast USA Summer|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SWC-IBTS-1  |Scotland Shelf Sea Q1|Scotland Shelf Sea Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|SWC-IBTS-4  |Scotland Shelf Sea Q4|Scotland Shelf Sea Quarter 4|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|WBLS| Western Black Sea| Western Black Sea|Institute of Fish Resources, Bulgaria|Europe|Requires data request|Elitsa Petrova (elitssa@yahoo.com), Feriha Tserkova & Vesselina Mihneva| Too few years|
|WCANN  |W Coast US|West Coast USA|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|ZAF-ATL  |Atlantic Ocean ZA|Atlantic Ocean South Africa|Department of Forestry, Fisheries and the Environment, South Africa|Africa|Requires data request| Tracey Fairweather traceyf@daff.gov.za|Included|
|ZAF-IND  |Indian Ocean ZA|Indian Ocean South Africa|Department of Forestry, Fisheries and the Environment, South Africa|Africa|Requires data request| Tracey Fairweather traceyf@daff.gov.za|Included|




```{r pull in fishglob database}

FishGlob_1.5 <- fread(here::here("data","FISHGLOB_v1.5_clean.csv"))

```

This version of FishGlob leaves out seasons for GMEX, so add them to the survey unit here.

```{r add season to GMEX}
#add season to GMEX to survey unit

FishGlob_1.5[survey == "GMEX", survey_unit := paste0(survey,"-",season)]
```

Also adding in seasons for NIGFS

```{r add season to NIGFS}
#add quarter to NIGFS survey unit

FishGlob_1.5[survey == "NIGFS", survey_unit := paste0(survey,"-",quarter)]
```

ZAF (South Africa) has distinct Atlantic and Indian Ocean surveys (split at ~20.01° E, Cape Agulhas)

```{r add longitudinal region to ZAF}
FishGlob_1.5[survey == "ZAF" & longitude <20.01, survey_unit := "ZAF-ATL"][survey == "ZAF" & longitude >= 20.01, survey_unit := "ZAF-IND"]
```
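
As a quick sanity check on the split, the sketch below applies the same 20.01° E rule to a small made-up table. The `toy` object and its coordinates are hypothetical, not FishGlob records. It confirms that every ZAF haul with a known longitude lands in exactly one unit and that other surveys are untouched.

```r
library(data.table)

# Hypothetical hauls; longitudes chosen to sit on either side of 20.01 E
toy <- data.table(
  survey      = c("ZAF", "ZAF", "ZAF", "AI"),
  longitude   = c(17.5, 20.01, 24.3, -170.2),
  survey_unit = c("ZAF", "ZAF", "ZAF", "AI")
)

# Same rule as above: west of 20.01 is Atlantic, 20.01 and east is Indian
toy[survey == "ZAF" & longitude < 20.01, survey_unit := "ZAF-ATL"]
toy[survey == "ZAF" & longitude >= 20.01, survey_unit := "ZAF-IND"]

# No ZAF haul should keep the unsplit label, and other surveys are unchanged
stopifnot(!any(toy$survey_unit == "ZAF"))
stopifnot(toy[survey == "AI", survey_unit] == "AI")

# Hauls per unit
toy[, .N, by = survey_unit]
```

Note that a ZAF haul with a missing longitude would match neither condition and would keep its original `survey_unit`, so the first `stopifnot()` is worth running on the real table too.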

Region names
```{r}
sort(unique(FishGlob_1.5[,survey_unit]))
```
##Data Replacements
####Greenland (version in FishGlob 1.5 is missing lengths and therefore biomass values)
This version was obtained directly from Karl-Michael Werner [karl-michael.werner@thuenen.de](mailto:karl-michael.werner@thuenen.de), who manages the Greenland survey (as of September 2023). He is based in Germany.

```{r}
#greenland <- 

```

####Norway
Prepped by Laurene Pecuchet (U Tromsø, Norway) in September 2023 to replace what's in FishGlob 1.5, because IMR "are quite concerned that FishGlob, and other studies, have been using a 'flawed' multi-surveys dataset that is available in NMDC (data portal of IMR). Turns out that this dataset was put publicly by miscommunication on NMDC after one published paper in Scientific Reports, and I think they only realized the existence of this dataset just the last year as some papers are coming out using it (especially the one from Cesc Gordo-Vilaseca in PNAS https://www.pnas.org/doi/10.1073/pnas.2120869120). They are now trying to make some damage controls to make sure that this dataset is not used ever again in the future, but that cleaned and standardised datasets of the Barents Sea survey that are publicly available in NMDC are used instead."

September 14: From Laurene, "I send you in attachment the “new” IMR survey formatted for Fishglob. I have done some small check of the dataset, and so far everything looks good, but I didn’t do a deep check yet, but I don’t see why there should be any problems with it....For your study, I think it is also important that you know that there has been some inconsistencies in taxonomic descriptions in the Barents Sea so that some species should be considered at the genus level instead of for biodiversity analysis, I send you in attach an excel (Barents Sea Fish Reference List.csv) file that summarize which species might be a misidentification and which one should be considered and merged." All of these files now live in "data/Norway_Sep2023"

Helpful guidance from here: https://www.hi.no/en/hi/nettrapporter/rapport-fra-havforskningen-en-2021-15
- "2.2.5 - Recommended adjustments to the output before analysis
Eelpouts and liparids. When combing years, we recommend that all records of eelpouts (Zoarcidae) are pooled to the family level, because they are notoriously difficult to identify (see Appendix 3). The same apply to liparids (Liparidae). If species level data of these families are used, consider excluding data from 2004-2006/2007. These years the staff on some of the Norwegian vessels were inexperienced, and proper identification keys for arctic species were lacking (compare for instance catches of Lycodes frigidus and Lycodes eudipleurostictus in the first years to the later years, Appendix 3). If species level data of these families are used, records to family levels should be removed or else these will be treated as a separate species in the further analysis of the data. Both Zoarcidae and Liparidae have unresolved taxonomy for some genera, therefore we have chosen to pool all liparids of the genus Careproctus and all eelpouts of the genus Gymnelus in the output. Sebastes. The column " Sebastes spp." contains mainly juvenile redfish. Small specimens are very difficult to identify so the protocol is to identify only individuals larger than 10 cm to the species level. Before analysis, all redfish ( S . mentella , S. norvegicus, S. viviparus and Sebastes spp .) should be pooled, or Sebastes spp. should be removed – if not it will be treated as a separate species in the analysis . Records in Appendix 2. The records of the S. viviparus west of Svalbard(Spitsbergen) are unreliable and should be removed if Sebastes data are kept at the species level (Appendix 2). Species verified for the Barents Sea, but outliers in terms the normal depth range, distribution area within the Barents Sea, size etc. were coded as questionable in the data base (Appendix 2) and should be removed before analysis. Consider also removing pelagic species (e.g. capelin and herring), as these are poorly sampled by the bottom trawl. 
The data should be standardised with towing distance before analysis."

Therefore, we will:
- Remove all records of eelpouts and liparids (Family = Zoarcidae or Liparidae) (as we only include species ID'd to species)
- Remove redfish (Genus = Sebastes)

```{r norway data}

#load Norwegian data
load(here::here("data","Norway_Sep2023","NOR-BTS_clean.RData"))
norway_clean <- data.table(data)

#remove observations without dates
norway_clean <- norway_clean[complete.cases(norway_clean[,.(month)]),]

#remove species records in accordance with recommendation from HI
norway_clean <- norway_clean[!(family %in% c("Zoarcidae","Liparidae") | genus == "Sebastes"),]

#some column names don't match fishglob (fishglob = num, num_h, num_cpue, wgt, wgt_h, wgt_cpue; norway = num, num_cpue (number of ind./hour), num_cpua (number of ind./km2), wgt, wgt_cpue (kg/min), wgt_cpua(kg/km2)  )
#also, some column units in the readme are incorrect. Therefore, I will generate _cpue and _h values here
# we will need to check  and rename columns
setnames(norway_clean, c("haul_dur"), c("haul_dur_m"))
norway_clean[,haul_dur := haul_dur_m/60] #haul duration currently in minutes, need hours
norway_clean[,num_h := num/haul_dur][,num_cpue := num/area_swept][,wgt_h := wgt/haul_dur][,wgt_cpue := wgt/area_swept]

#change some columns to numeric
cols = c("month","day")
norway_clean[,(cols) := lapply(.SD,as.numeric),.SDcols = cols]

#keep only the FishGlob columns (this also drops source and timestamp)
fishglob_colnames <- colnames(FishGlob_1.5)
norway_clean <- norway_clean[,..fishglob_colnames]

norway_clean[survey == "Nor-BTS" & month %in% c(1:6), survey_unit := "Nor-BTS-1"][survey == "Nor-BTS" & month %in% c(7:12), survey_unit := "Nor-BTS-3"]

#IBTS and Nor-BTS surveys overlap south of 62°N, so delete all hauls that occur south of 62°N
norway_clean <- norway_clean[latitude  >= 62,]

```
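
As a minimal sketch of the unit conversions in the chunk above, the toy table below derives per-hour and per-area catch rates. The values are made up (not Norwegian data), `toy_norway` is a hypothetical name, and the sketch assumes `haul_dur_m` is in minutes and `area_swept` is in km2.

```r
library(data.table)

# Toy hauls (hypothetical values): counts, weights (kg),
# haul duration (minutes, assumed) and swept area (km2, assumed)
toy_norway <- data.table(
  haul_id    = c("h1", "h2"),
  num        = c(30, 12),
  wgt        = c(6, 3),
  haul_dur_m = c(30, 15),
  area_swept = c(0.05, 0.025)
)

# minutes -> hours
toy_norway[, haul_dur := haul_dur_m / 60]

# catch per hour (_h) and catch per unit area (_cpue)
toy_norway[, `:=`(
  num_h    = num / haul_dur,
  wgt_h    = wgt / haul_dur,
  num_cpue = num / area_swept,
  wgt_cpue = wgt / area_swept
)]
```

Dividing by `haul_dur` (hours) rather than `haul_dur_m` keeps the `_h` columns in the same units as the rest of FishGlob.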


Delete the old Norway records (Greenland is left in place for now)
```{r}
FishGlob_1.5 <- FishGlob_1.5[!(survey %in% c("Nor-BTS"
                                             #,
                                             #"GRL-DE" #ignore greenland for now...
                                             ))]
```


Add in updated Norway data (the Greenland update is skipped for now)
```{r}
FishGlob_1.5 <-rbind(FishGlob_1.5,norway_clean)
#FishGlob_1.5 <-rbind(FishGlob_1.5,greenland)
```


##Preliminary Data Cuts
###Specific Regional Changes Before Cutting to 10 years only

*GSL*
- North: we have data 1980-2019, but gear changes in 2004/2005, so let's use later portion (more consistent months of sampling; 2005-2019; 15 years) 
- South: we have data 1970-2019, but gear/vessel changes in 1985 and again in 1992, so again let's use later portion (1992-2019; 27 years)
- See [this github issue](https://github.com/AquaAuma/fishglob/issues/72)

```{r GSL fixes}
#identify haul_ids of hauls we should remove from GSL surveys
haul_ids_to_remove_GSL <- unique(FishGlob_1.5[(survey == "GSL-N" & year < 2005)|(survey == "GSL-S" & year < 1992),haul_id])

FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_GSL),] #remove hauls before consistent gear/vessel was used
```
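
If more surveys ever need a start-year cut, one option is a small lookup table joined onto the hauls. This is only a sketch on toy data; `toy_hauls` and `first_consistent_year` are hypothetical names, and the cutoffs are the GSL years from the notes above.

```r
library(data.table)

# Toy hauls (hypothetical)
toy_hauls <- data.table(
  survey  = c("GSL-N", "GSL-N", "GSL-S", "GSL-S", "EBS"),
  year    = c(2003, 2010, 1990, 1995, 1983),
  haul_id = paste0("h", 1:5)
)

# First year with consistent gear/vessel, per survey
first_consistent_year <- data.table(
  survey     = c("GSL-N", "GSL-S"),
  first_year = c(2005, 1992)
)

# Update join: attach the cutoff by reference; surveys without a cutoff get NA
toy_hauls[first_consistent_year, on = "survey", first_year := i.first_year]

# Keep hauls with no cutoff, or hauls at/after the cutoff
toy_hauls_kept <- toy_hauls[is.na(first_year) | year >= first_year]
```

The explicit `is.na(first_year)` keeps surveys that have no cutoff, which would otherwise be dropped by the comparison.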

*SGEORG*
- From Martin Collins, "Most surveys were focused on demersal fish on the South Georgia shelf (< 350 m), but surveys in 2003, 2010 and 2019 had some deeper trawls.  The deeper trawls caught very different fish, so are unlikely to be of use to a long-term analysis, but I have left them in."

- Delete all trawls deeper than 350 m
```{r}
haul_ids_to_remove_SGEORG <- unique(FishGlob_1.5[(survey == "SGEORG" & depth >350),haul_id])

FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_SGEORG),] #remove hauls deeper than 350 m
```

*NZ-CHAT*
- Bump December observations to the next year, because each survey season spans months 12, 1, and 2
```{r}
#bump observations forward
FishGlob_1.5[survey == "NZ-CHAT" & month == 12,  year := year+1, ]
```
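
A toy illustration of the December bump (hypothetical rows, `toy_chat` is a made-up name): after the update, December, January and February hauls from one austral summer share a single survey year, and other surveys are untouched.

```r
library(data.table)

toy_chat <- data.table(
  survey = c("NZ-CHAT", "NZ-CHAT", "NZ-CHAT", "EBS"),
  year   = c(2010, 2011, 2011, 2010),
  month  = c(12, 1, 2, 12)
)

# Only NZ-CHAT December rows move forward one year
toy_chat[survey == "NZ-CHAT" & month == 12, year := year + 1]
```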


###Because time is an essential component of these analyses, we will get rid of any survey x season combinations that are not sampled for at least 10 years

```{r summary by survey region}
#new column for total number of years sampled
FishGlob_1.5[,years_sampled := length(unique(year)),.(survey_unit)]

summary(FishGlob_1.5$years_sampled) #ranges from 2 (DFO Strait of Georgia) to 57 (Northeast US)
View(unique(FishGlob_1.5[,.(survey_unit, years_sampled)]))

#statistics about full dataset
nrow(FishGlob_1.5) 
length(unique(FishGlob_1.5[,survey])) 
length(unique(FishGlob_1.5[,survey_unit])) 

#remove observations for any region x season combinations sampled in fewer than 10 years
FishGlob.10year <- FishGlob_1.5[years_sampled >= 10,]

#statistics about reduced 10 year dataset
nrow(FishGlob.10year) 
length(unique(FishGlob.10year[,survey])) 
length(unique(FishGlob.10year[,as.character(survey_unit)])) 

#remove full database
rm(FishGlob_1.5)


```
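
The 10-year rule can be checked on a toy table. This is a sketch with hypothetical data (`toy_obs`), using `uniqueN()` as a shorthand for `length(unique())`: unit A is sampled in 12 distinct years and is kept, unit B has four records but only 3 distinct years and is dropped.

```r
library(data.table)

toy_obs <- data.table(
  survey_unit = rep(c("A", "B"), times = c(12, 4)),
  year        = c(2001:2012, 2001, 2001, 2002, 2003)
)

# Distinct years per survey unit (repeat records within a year count once)
toy_obs[, years_sampled := uniqueN(year), by = survey_unit]

toy_10year <- toy_obs[years_sampled >= 10]
```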

###For taxonomic analyses, resolution to species is required. Therefore, we will exclude any observations not resolved to species or subspecies.

```{r spp ID only}
#convert month to numeric
FishGlob.10year[,month := as.numeric(month)]

FishGlob.10year.spp <- FishGlob.10year[rank %in% c("Species", "Subspecies"),] #3869384 total observations

#remove the 10-year dataset that still includes records not resolved to species
rm(FishGlob.10year)

#vector with all survey names
all_survey_units <- sort(unique(FishGlob.10year.spp[,survey_unit]))

#calculate # species per year
FishGlob.10year.spp_survey_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, accepted_name)])

FishGlob.10year.spp_survey_year[,spp_count_survey_year := uniqueN(accepted_name),.(survey_unit, year)]

FishGlob.10year.spp_survey_year.r <-unique(FishGlob.10year.spp_survey_year[,.(survey_unit,  year, spp_count_survey_year)])

nrow(FishGlob.10year.spp_survey_year.r)

#calculate # hauls per year
FishGlob.10year.spp_haulid_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, haul_id)])

FishGlob.10year.spp_haulid_year[,haulid_count_survey_year := uniqueN(haul_id),.(survey_unit, year)]

FishGlob.10year.spp_haulid_year.r <-unique(FishGlob.10year.spp_haulid_year[,.(survey_unit,  year, haulid_count_survey_year)])

nrow(FishGlob.10year.spp_haulid_year.r)

```
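
For reference, the per-year species and haul counts can also be obtained in one grouped summary. This is a sketch on hypothetical data (`toy_spp`), not a replacement for the chunk above.

```r
library(data.table)

toy_spp <- data.table(
  survey_unit   = c("A", "A", "A", "A", "B", "B"),
  year          = c(2001, 2001, 2001, 2002, 2001, 2001),
  haul_id       = c("a1", "a1", "a2", "a3", "b1", "b2"),
  accepted_name = c("Gadus morhua", "Clupea harengus", "Gadus morhua",
                    "Gadus morhua", "Gadus morhua", "Gadus morhua")
)

# One row per survey unit x year with distinct species and distinct hauls
toy_counts <- toy_spp[, .(spp_count  = uniqueN(accepted_name),
                          haul_count = uniqueN(haul_id)),
                      by = .(survey_unit, year)]
```

Here unit A in 2001 has 2 species across 2 hauls, A in 2002 has 1 species in 1 haul, and B in 2001 has 1 species across 2 hauls.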


##Visually Inspect Distribution of Data Through Time and Space

##Spatial and Temporal Patterns in All Trawl Surveys

Let's look at the number of hauls per year/month and year/quarter and year/season visually

```{r hauls per year, month, quarter}
#unique survey, survey_unit, year, month, quarter, season, haul_id, lat, lon
FishGlob.10year.uniquehauls <- unique(FishGlob.10year.spp[,.(survey, survey_unit, year,month,quarter,season,haul_id, latitude, longitude,haul_dur)])

#add column with adjusted longitude for few surveys that cross dateline (NZ-CHAT and AI)
FishGlob.10year.uniquehauls[,longitude_adj := ifelse((survey_unit %in% c("AI","NZ-CHAT") & longitude > 0),longitude-360,longitude)]

FishGlob.10year.uniquehauls[,haul_counts_per_survey_season_month :=uniqueN(haul_id),.(survey, month, season)][, #count # hauls per survey, season, and month
                     haul_counts_per_survey_quarter_month :=uniqueN(haul_id),.(survey, month, quarter)][,#count # hauls per survey, month, and quarter
                     total_hauls_survey :=uniqueN(haul_id),.(survey)][,#count # hauls per survey in all years
                                                        
              #proportion of hauls for each survey, season, and month divided by total # over all years
                     haul_proportion_survey_season :=haul_counts_per_survey_season_month/total_hauls_survey][,
              #proportion of hauls for each survey, quarter, and month divided by total # over all years
                     haul_proportion_survey_quarter :=haul_counts_per_survey_quarter_month/total_hauls_survey][,
                                                                                                               
                     haul_count_per_survey_year_month :=uniqueN(haul_id),.(year, survey_unit, month)][, #count # hauls per survey unit, year, and month
                     total_hauls_survey_year := uniqueN(haul_id),.(survey_unit,year)][, #count total # hauls per survey unit and year
                     #proportion of hauls for each survey unit and month divided by total # hauls within a survey unit within a year
                     haul_proportion_month_yearly := haul_count_per_survey_year_month/total_hauls_survey_year][, 

                     haul_count_per_survey_year_quarter :=uniqueN(haul_id),.(year, survey_unit, quarter)][, #count # hauls per survey unit, year, and quarter
                     #proportion of hauls for each survey unit and quarter divided by total # hauls within a survey unit within a year
                     haul_proportion_quarter_yearly := haul_count_per_survey_year_quarter/total_hauls_survey_year] 

FishGlob.10year.uniquehauls.season <- unique(FishGlob.10year.uniquehauls[,.(survey, survey_unit, month, season, haul_counts_per_survey_season_month,total_hauls_survey, haul_proportion_survey_season)]) #relative sampling by season across all years

FishGlob.10year.uniquehauls.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey,survey_unit , month, quarter, haul_counts_per_survey_quarter_month,total_hauls_survey, haul_proportion_survey_quarter)]) #relative sampling by quarter across all years

FishGlob.10year.uniquehauls.annual.month <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, month, haul_count_per_survey_year_month,total_hauls_survey_year,haul_proportion_month_yearly)]) #relative sampling by month within years

FishGlob.10year.uniquehauls.annual.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, quarter, haul_count_per_survey_year_quarter,total_hauls_survey_year,haul_proportion_quarter_yearly)]) #relative sampling by quarter within years

#how does #hauls vary with season and month?
survey_season_month_hauls <- ggplot(FishGlob.10year.uniquehauls.season) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  facet_wrap(~survey,scales = "free_y") +
  theme_classic()

ggsave(survey_season_month_hauls, filename = "survey_season_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")

#how does #hauls vary with quarter and month?
survey_quarter_month_hauls <- ggplot(FishGlob.10year.uniquehauls.quarter) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  facet_wrap(~survey,scales = "free_y") +
  theme_classic()

ggsave(survey_quarter_month_hauls, filename = "survey_quarter_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")

#how does #hauls vary with year and month?
year_survey_month_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.month) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  facet_wrap(~survey_unit,scales = "free_y") +
  theme_classic()

ggsave(year_survey_month_hauls, filename = "year_survey_month_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")

#how does #hauls vary with year and quarter?
year_survey_quarter_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.quarter) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  facet_wrap(~survey_unit,scales = "free_y") +
  theme_classic()

ggsave(year_survey_quarter_hauls, filename = "year_survey_quarter_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")
```
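
The "proportion of annual hauls" shown in the tile plots boils down to a count per month divided by the yearly total. A minimal sketch with hypothetical hauls (`toy_hauls`, `toy_month`):

```r
library(data.table)

toy_hauls <- data.table(
  survey_unit = "A",
  year        = c(2001, 2001, 2001, 2001, 2002, 2002),
  month       = c(6, 6, 6, 7, 6, 7),
  haul_id     = paste0("h", 1:6)
)

# Hauls per survey unit x year x month
toy_month <- toy_hauls[, .(hauls = uniqueN(haul_id)),
                       by = .(survey_unit, year, month)]

# Share of that year's hauls falling in each month
toy_month[, prop := hauls / sum(hauls), by = .(survey_unit, year)]
```

Within each survey unit and year the proportions sum to 1, which is a quick sanity check on the plotted values.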

Now, let's look at how location of sampling varies by month of sampling and year of sampling 

```{r location by year plots}
location_by_year <- ggplot(FishGlob.10year.uniquehauls) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  facet_wrap(~survey_unit, scales = "free") +
  theme_classic()

ggsave(location_by_year, filename = "location_by_year.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_year, filename = "location_by_year.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_year, filename = "location_by_year.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
```
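
The maps use `longitude_adj` so that surveys straddling the dateline plot as one contiguous cloud rather than two clusters at opposite edges. A small sketch with made-up coordinates (`toy_lon` is hypothetical; `fifelse()` is data.table's type-stable `ifelse()`):

```r
library(data.table)

toy_lon <- data.table(
  survey_unit = c("AI", "AI", "NZ-CHAT", "EBS"),
  longitude   = c(179.5, -179.5, 178, -170)
)

# Shift eastern-hemisphere points of dateline-crossing surveys by -360 degrees
toy_lon[, longitude_adj := fifelse(
  survey_unit %in% c("AI", "NZ-CHAT") & longitude > 0,
  longitude - 360,
  longitude
)]
```

After the shift, the two AI points sit one degree apart (-180.5 and -179.5) instead of 359 degrees apart.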


```{r location by month plots}
(location_by_month <- ggplot(FishGlob.10year.uniquehauls) +
  geom_point(aes(x = longitude_adj, y = latitude, color = as.numeric(month)), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  facet_wrap(~survey_unit, scales = "free") +
  theme_classic())

ggsave(location_by_month, filename = "location_by_month.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_month, filename = "location_by_month.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_month, filename = "location_by_month.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
```


##Region Specific Data Processing

- Fredston et al. 2022 Nature and Batt et al. 2017 Ecology Letters informed North American data processing
- Personal communication with Aurore Maureaud and Laurene Pecuchet re: work by A. Maureaud, L. Pecuchet and R. Frelat, together with the supplementary material for Maureaud et al. 2019 Proceedings of the Royal Society B: Biological Sciences, informed European data processing
- Additional data processing was informed by the data itself and by the FishGlob pdf summary documents
- Limit each survey unit to a maximum of 3 months, representative of a 'season' (exception = West Coast USA, where all 4 months are sampled consistently)
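
Each region below ends with the same kind of filter (survey unit, months, and sometimes a start year or an excluded year). As a sketch only, that rule could be wrapped in a helper; `select_hauls()` and its arguments are hypothetical and not part of the original script, and the toy rows are made up.

```r
library(data.table)

# Hypothetical helper: haul_ids of one survey unit that fall within the
# chosen months, on/after min_year, and outside any dropped years
select_hauls <- function(hauls, unit, months, min_year = -Inf,
                         drop_years = integer(0)) {
  unique(hauls[survey_unit == unit &
                 month %in% months &
                 year >= min_year &
                 !(year %in% drop_years),
               haul_id])
}

toy_hauls <- data.table(
  survey_unit = c("EVHOE", "EVHOE", "EVHOE", "EVHOE", "AI"),
  year        = c(2016, 2017, 2018, 2018, 2018),
  month       = c(10, 11, 11, 6, 7),
  haul_id     = paste0("h", 1:5)
)

# Months 10-12, excluding 2017
toy_evhoe_keep <- select_hauls(toy_hauls, "EVHOE", months = 10:12,
                               drop_years = 2017)
```

The explicit per-region chunks are kept below because the reasoning for each cut differs by region.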

####"AI"
```{r AI visual}
ggplot(FishGlob.10year.uniquehauls.season[survey == "AI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey == "AI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey == "AI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey == "AI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "AI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "AI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Most hauls in 6,7,8
- Seemingly consistent spatial distribution through time
- No dramatic changes in spp richness 
```{r AI processing}
ai_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "AI" & month %in% c(6:8),haul_id])
```
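
To put a number on "most hauls in 6,7,8", the share of hauls retained by a month window can be computed directly. A sketch on hypothetical hauls (`toy_ai`, `keep_months`):

```r
library(data.table)

toy_ai <- data.table(
  haul_id = paste0("h", 1:10),
  month   = c(5, 6, 6, 6, 7, 7, 7, 8, 8, 9)
)

keep_months <- 6:8

# Fraction of hauls that fall inside the retained months
share_kept <- toy_ai[, mean(month %in% keep_months)]
```

In this toy table 8 of 10 hauls fall in June to August, so `share_kept` is 0.8.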


####BITS
(We have two survey units for BITS: quarter 1 and quarter 4)

*BITS-1*

Per Fredston et al. 2023, every year after 2000 has >400 hauls, whereas most of the earlier years have <50.

```{r  BITS1 visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Keep both months (2,3)
- Seemingly consistent spatial distribution through time
- Consistent # of species and # hauls after 2000
```{r BITS1 processing}
bits1_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-1" & month %in% c(2,3) & year > 2000,haul_id])
```
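
The start year can also be read off the yearly haul counts rather than by eye. In this sketch the counts are invented to mimic the pattern described above (sparse early years, >400 hauls later); `toy_year_counts` and `min_hauls` are hypothetical names.

```r
library(data.table)

toy_year_counts <- data.table(
  year  = 1996:2005,
  hauls = c(20, 35, 10, 45, 30, 410, 420, 450, 430, 440)
)

min_hauls <- 400  # threshold taken from the note above

# Last year that falls below the threshold; keep only the years after it
last_sparse_year <- toy_year_counts[hauls < min_hauls, max(year)]
years_kept <- toy_year_counts[year > last_sparse_year, year]
```

With these toy counts the last sparse year is 2000, which corresponds to the `year > 2000` filter.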

*BITS-4*
```{r  BITS4 visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```

- Keep (10,11,12)
- Keep years after 2000 (data start in 1996, but there is a gap in 1997 and 1998, and 1996 was sampled entirely in December; spp richness in the first survey is also very low; consistent # of hauls after 2000)
- Seemingly consistent spatial distribution through time

```{r BITS4 processing}
bits4_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-4" & month %in% c(10:12) & year > 2000,haul_id])
```


####CHL (Chile)

```{r  CHL visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "CHL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "CHL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "CHL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "CHL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "CHL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "CHL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Keep (7,8,9)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

```{r CHL processing}
chl_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "CHL" & month %in% c(7:9),haul_id])
```



####DFO-NF


```{r  DFO-NF visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-NF",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-NF",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```
- Keep (10,11,12)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

```{r DFO-NF processing}
dfo_nf_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF" & month %in% c(10:12),haul_id])
```


####DFO-QCS

```{r  DFO-QCS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Keep (7,8)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

```{r DFO-QCS processing}
dfo_qcs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS" & month %in% c(7,8),haul_id])
```



####EBS

- Sampling years prior to 1984 (data begin in 1982) were excluded from analysis due to large apparent increases in the number of species recorded in the first two years (Batt et al. 2017).

```{r  EBS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EBS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EBS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EBS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EBS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EBS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EBS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```

- Keep (6,7,8)
- Seemingly consistent spatial distribution through time
- Per Batt et al. 2017, limit to >= 1984
- No clear changes in spp richness through time
- No clear changes in # hauls through time

```{r EBS processing}
ebs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EBS" & month %in% c(6,7,8) & year >= 1984,haul_id])
```


####EVHOE

```{r  EVHOE visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EVHOE",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EVHOE",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 10,11,12
-Seemingly consistent spatial distribution through time
-Very low sampling in 2017 (and also low richness), exclude this year

```{r EVHOE processing}
evhoe_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EVHOE" & month %in% c(10,11,12) & year != 2017, haul_id])
```
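
The per-survey filters in this section all follow the same pattern (subset one survey unit, optionally restrict months, optionally drop years, return unique haul ids). As a minimal sketch of that pattern, the helper below uses a hypothetical function name (`select_hauls`) and an invented toy table standing in for `FishGlob.10year.uniquehauls`; it is not part of the original script.

```r
library(data.table)

# Toy stand-in for FishGlob.10year.uniquehauls (all values are invented)
toy_hauls <- data.table(
  survey_unit = "EVHOE",
  haul_id     = paste0("h", 1:6),
  month       = c(10, 11, 12, 9, 10, 11),
  year        = c(2015, 2015, 2016, 2016, 2017, 2018)
)

# Hypothetical helper: unique haul ids for one survey unit,
# optionally restricted to some months and with some years dropped
select_hauls <- function(hauls, unit, keep_months = NULL, drop_years = NULL) {
  out <- hauls[survey_unit == unit]
  if (!is.null(keep_months)) out <- out[month %in% keep_months]
  if (!is.null(drop_years))  out <- out[!(year %in% drop_years)]
  unique(out$haul_id)
}

# Same rule as the EVHOE filter: months 10-12, excluding 2017
toy_keep <- select_hauls(toy_hauls, "EVHOE", keep_months = 10:12, drop_years = 2017)
print(toy_keep)
```

Writing the rule once would make it easier to see at a glance which surveys differ only in their month window or excluded years.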


####FALK (excluded from final dataset)
```{r FALK visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FALK",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FALK",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FALK",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FALK",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FALK",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FALK",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```
-Keep February (month 2) only, from 2004 onward (most consistent sampling)
-Inconsistent spatial distribution through time, but this will be fixed in next step with spatial standardization


```{r FALK processing}
falk_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FALK" & month %in% c(2) & year >= 2004, haul_id])
```


####FR-CGFS

```{r  FR-CGFS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```
-Keep months 9,10,11
-Consistent spatial distribution through time
-Seemingly consistent richness through time
-Seemingly consistent #hauls through time


```{r FR-CGFS processing}
fr_cgfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS" & month %in% c(9,10,11), haul_id])
```
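
The diagnostic plots are repeated verbatim for every survey unit, with only the `survey_unit` string changing. One possible way to reduce the duplication is a small plotting function; the sketch below wraps just the month-by-season tile plot. The function name `plot_season_tiles` and the toy table are assumptions for illustration, not objects from the original script.

```r
library(data.table)
library(ggplot2)
library(viridis)

# Toy stand-in for FishGlob.10year.uniquehauls.season (values are invented)
toy_season <- data.table(
  survey_unit = c("FR-CGFS", "FR-CGFS", "GOA"),
  month       = c(9, 10, 6),
  season      = c("Fall", "Fall", "Summer"),
  haul_proportion_survey_season = c(0.4, 0.6, 1)
)

# Hypothetical helper: month-by-season tile plot for one survey unit
plot_season_tiles <- function(dt, unit) {
  ggplot(dt[survey_unit == unit]) +
    geom_tile(aes(x = factor(month), y = factor(season),
                  fill = haul_proportion_survey_season), color = "white") +
    scale_fill_viridis() +
    labs(x = "Month", y = "Season",
         fill = "Proportion of All Survey Hauls in FishGlob") +
    theme_classic()
}

p <- plot_season_tiles(toy_season, "FR-CGFS")
```

The other seven plots could be wrapped the same way and called in a loop over survey units, at the cost of losing the per-survey chunk labels.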

####GIN (excluded from final dataset)

```{r  GIN visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GIN",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GIN",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GIN",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GIN",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GIN",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GIN",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Exclude this region, no consistent sampling through time

```{r GIN processing}
gin_hauls_keep <- NULL
```
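
Excluded surveys are given a `NULL` keep vector. How the `*_hauls_keep` objects are combined later is not shown in this section; assuming they are concatenated with `c()`, a `NULL` entry simply contributes no haul ids. A toy illustration with invented ids and hypothetical object names:

```r
# Toy keep vectors standing in for the *_hauls_keep objects (invented ids)
toy_survey_a_keep <- c("a_1", "a_2")
toy_survey_b_keep <- NULL        # an excluded survey
toy_survey_c_keep <- c("c_1")

# c() silently drops NULL, so the excluded survey adds nothing
toy_all_keep <- c(toy_survey_a_keep, toy_survey_b_keep, toy_survey_c_keep)
print(toy_all_keep)
```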

####GMEX
-In the Gulf of Mexico, we restricted our analysis to data from 1984-2000 (full range 1982-2014); if all years had been used, the number of sites sampled in at least 85% of years would drop from 39 to 13. (Batt et al. 2017)
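
The 85% criterion quoted above can be illustrated with a short calculation: for each site, compute the share of survey years in which it was sampled, then keep sites at or above the threshold. This is only a sketch; the `site` column and the toy values are invented, and Batt et al. may define sites differently.

```r
library(data.table)

# Toy data: one row per haul, with a hypothetical site id (invented values)
toy <- data.table(
  site = c(rep("A", 10), rep("B", 9), rep("C", 4)),
  year = c(2001:2010, 2001:2009, 2001:2004)
)

n_years <- uniqueN(toy$year)

# Share of survey years in which each site was sampled at least once
site_coverage <- toy[, .(prop_years = uniqueN(year) / n_years), by = site]

# Sites meeting the 85% threshold
sites_85 <- site_coverage[prop_years >= 0.85, site]
print(site_coverage)
print(sites_85)
```

Recomputing `site_coverage` for different year windows shows how the number of retained sites depends on the chosen window.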

GMEX Fall 
```{r  GMEX Fall visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 9,10,11
-Inconsistent spatial distribution through time, so restrict to longitudes west of -87.5
-Seemingly consistent richness through time
-Seemingly consistent #hauls through time


```{r GMEX-Fall processing}
gmex_fall_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall" & month %in% c(9,10,11) & longitude_adj < -87.5, haul_id])
```

GMEX Summer
```{r  GMEX Summer visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 5,6,7
-Inconsistent spatial distribution through time, but this will be fixed in spatial standardization step
-Seemingly consistent richness within each period (before 2008, and 2008 onward)
-Seemingly consistent #hauls through time
-Jump from 2007 to 2008, when spatial footprint increases, so I will only use data from before 2008

```{r GMEX-Summer processing}
gmex_summer_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer" & month %in% c(5,6,7) & year < 2008, haul_id])
```
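
A change in spatial footprint like the 2007 to 2008 jump can also be checked numerically rather than only by eye. The sketch below uses the per-year bounding-box span in longitude and latitude as a crude proxy for footprint; the toy coordinates are invented and the proxy is an assumption, not the method used in the manuscript.

```r
library(data.table)

# Toy haul coordinates (invented): footprint widens from 2008 onward
toy <- data.table(
  year          = rep(2005:2010, each = 3),
  longitude_adj = c(-95, -93, -91,  -95, -93, -91,  -95, -92, -91,
                    -97, -90, -84,  -97, -90, -84,  -97, -89, -83),
  latitude      = c(28, 28.5, 29,  28, 28.5, 29,  28, 28.5, 29,
                    26, 28, 30,    26, 28, 30,    26, 28, 30)
)

# Bounding-box span per year as a rough footprint measure
footprint <- toy[, .(lon_span = diff(range(longitude_adj)),
                     lat_span = diff(range(latitude))), by = year]
print(footprint)
```

A sudden increase in `lon_span` or `lat_span` between consecutive years points to the year in which the sampled area changed.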

####GOA
```{r GOA visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GOA",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GOA",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GOA",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GOA",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GOA",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GOA",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 6,7,8
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent #hauls through time

```{r GOA processing}
goa_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GOA" & month %in% c(6,7,8), haul_id])
```

####GRL-DE
-According to Beukhof et al. 2019, all surveys took place in October and November
```{r GRL-DE visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GRL-DE",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GRL-DE",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-No months recorded in the data set, but according to Beukhof et al. 2019 all sampling took place in October and November, so keep all hauls
-Consistent spatial distribution through time
-Seemingly consistent richness
-Number of hauls drops between 1991 and 1992, and both 1992 and 2017 appear low, so limit to the years in between (1993-2016)

```{r GRL-DE processing}
grl_de_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE" & year %in% c(1993:2016), haul_id])
```
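
Years with unusually few hauls can be flagged with a simple count per year. The rule used below (fewer than half the median annual haul count) is a hypothetical threshold chosen for illustration, and the toy data are invented; the original decision was made by inspecting the bar plots.

```r
library(data.table)

# Toy haul table (invented): 1992 has far fewer hauls than other years
toy <- data.table(
  year    = rep(c(1991, 1992, 1993, 1994, 1995), times = c(10, 3, 8, 9, 8)),
  haul_id = paste0("h", 1:38)
)

# Number of unique hauls per year
hauls_per_year <- toy[, .(n_hauls = uniqueN(haul_id)), by = year]

# Hypothetical rule: flag years below half the median annual count
hauls_per_year[, low := n_hauls < 0.5 * median(n_hauls)]
print(hauls_per_year)
```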

####GSL

GSL-N
```{r GSL-N visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-N",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-N",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 6,7,8
-Consistent spatial distribution through time
-Seemingly consistent richness
-Number of hauls in 2005 is higher, so start in 2006

```{r GSL-N processing}
gsl_n_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-N" & year > 2005, haul_id])
```
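
Because the GSL-N filter above selects on year alone, a quick tabulation can confirm that the retained hauls fall within the months listed in the notes. A minimal sketch with an invented toy table (the object names are hypothetical):

```r
library(data.table)

# Toy haul table (invented values)
toy_hauls <- data.table(
  haul_id = paste0("h", 1:5),
  month   = c(7, 8, 8, 9, 8),
  year    = c(2005, 2006, 2007, 2008, 2009)
)

# Year-only filter, mirroring the structure of the GSL-N rule
toy_keep <- unique(toy_hauls[year > 2005, haul_id])

# Months represented among the retained hauls
kept_months <- toy_hauls[haul_id %in% toy_keep, sort(unique(month))]

# Retained hauls that fall outside the expected window (months 6-8)
outside <- toy_hauls[haul_id %in% toy_keep & !(month %in% 6:8), haul_id]
print(kept_months)
print(outside)
```

An empty `outside` vector would indicate that the year filter and the month note agree for the real data.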

GSL-S
```{r GSL-S visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-S",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-S",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 8,9,10
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r GSL-S processing}
gsl_s_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-S" & month %in% c(8:10), haul_id])
```

####ICE-GFS

```{r ICE-GFS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 2,3,4
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r ICE-GFS processing}
ice_gfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS" & month %in% c(2:4), haul_id])
```

####IE-IGFS

```{r IE-IGFS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 10,11,12
-Consistent spatial distribution through time after 2004 (sampled far east in 2003 and 2004)
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r IE-IGFS processing}
ie_igfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS" & month %in% c(10:12) & year > 2004, haul_id])
```

####IS-MOAG (excluded from final dataset)
```{r IS-MOAG visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Sampling too scattered over time; exclude this survey

```{r IS-MOAG processing}
is_moag_hauls_keep <- NULL
```

####MEDITS
```{r MEDITS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MEDITS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MEDITS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep all surveys in quarter 2
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r MEDITS processing}
medits_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "MEDITS", haul_id])
```


####MRT (excluded from final dataset)
```{r MRT visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MRT",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MRT",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MRT",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MRT",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MRT",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MRT",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Sampling inconsistent, exclude completely

```{r MRT processing}
mrt_hauls_keep <- NULL
```
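
Setting `mrt_hauls_keep` to `NULL` is convenient if the per-survey keep vectors are later concatenated with `c()`, because `NULL` contributes no elements. A minimal sketch, assuming that kind of combination step; the haul IDs and `example_` objects are invented for illustration:

```r
# Hypothetical haul IDs, for illustration only
example_keep_a        <- c("haul_001", "haul_002")
example_keep_excluded <- NULL              # survey dropped entirely, as for MRT
example_keep_b        <- c("haul_101")

# NULL adds nothing when vectors are concatenated
example_all_keep <- c(example_keep_a, example_keep_excluded, example_keep_b)
length(example_all_keep)
```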

####NAM

```{r NAM visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NAM",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NAM",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NAM",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NAM",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NAM",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NAM",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep hauls in months 1 and 2 (most consistently sampled)
-Consistent spatial distribution through time
-Seemingly consistent richness except for 1998 (exclude)
-Seemingly consistent number of hauls except for 1998 (exclude)

```{r NAM processing}
nam_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NAM" & month %in% c(1,2) & year != 1998, haul_id])
```
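
The month and year filter above can be checked on a small table before trusting it on the full data. This is only a sketch: the column names mirror `FishGlob.10year.uniquehauls`, but the rows and the `toy_` objects are invented, and `dropped_reason` is a hypothetical helper column.

```r
library(data.table)

# Toy haul table; values are invented for illustration
toy_nam_hauls <- data.table(
  survey_unit = "NAM",
  haul_id     = paste0("h", 1:6),
  year        = c(1997, 1998, 1998, 1999, 1999, 2000),
  month       = c(1, 1, 2, 2, 5, 1)
)

# Same logic as the processing chunk: months 1-2 only, and drop 1998
toy_nam_keep <- unique(toy_nam_hauls[survey_unit == "NAM" & month %in% c(1, 2) & year != 1998, haul_id])

# Tally what was dropped and why, as a quick sanity check
toy_nam_hauls[, dropped_reason := fifelse(year == 1998, "year 1998",
                                   fifelse(!month %in% c(1, 2), "month", "kept"))]
toy_nam_hauls[, .N, by = dropped_reason]
```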


####NEUS


NEUS Spring
```{r NEUS-Spring visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 3,4,5
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness (especially after 1987, should be fixed with standardization step)
-Seemingly consistent number of hauls (especially after 1981, should be fixed with standardization step)

```{r NEUS-Spring processing}
#calculate wgt_cpue and num_cpue (per km^2; 0.0384 km^2 avg from Sean Lucey) and wgt_h and num_h (per hour; all values calibrated to standard pre-2009 30 minute tow)
FishGlob.10year.spp[survey == "NEUS", `:=`(wgt_h = wgt/0.5,
                                           wgt_cpue = wgt/0.0384,
                                           num_h = num/0.5,
                                           num_cpue = num/0.0384)]


#also, for northeast, we are going to delete any hauls before 2009 that are outside of +/- 5 minutes of 30 minutes and 2009 forward that are outside of +/- 5 minutes of 20 minutes
neus_spring_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Spring" & month %in% c(3:5) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
                                                        (survey_unit == "NEUS-Spring" & month %in% c(3:5) & year >= 2009 & (haul_dur > 0.25  & haul_dur < 0.42))), haul_id])


```
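
The `haul_dur` bounds used above (0.42 to 0.58 and 0.25 to 0.42) appear to be rounded hour equivalents of 25 to 35 minutes and 15 to 25 minutes. A small sketch of that conversion, where `dur_window_hours` is a hypothetical helper and not part of this script:

```r
# Hypothetical helper: convert a target tow duration +/- tolerance (minutes) to hours
dur_window_hours <- function(target_min, tol_min = 5) {
  c(lower = (target_min - tol_min) / 60, upper = (target_min + tol_min) / 60)
}

pre_2009  <- dur_window_hours(30)  # 25 to 35 minutes
post_2009 <- dur_window_hours(20)  # 15 to 25 minutes

round(pre_2009, 2)
round(post_2009, 2)
```

Because the processing chunk uses rounded bounds with strict inequalities, a tow of exactly 25 minutes (about 0.4167 h) falls inside the 2009-onward window but outside the pre-2009 one, and a tow of exactly 15 minutes (0.25 h) is excluded.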

NEUS Fall

```{r NEUS-Fall visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 9,10,11
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness (especially after 1984, should be fixed with standardization step)
-Seemingly consistent number of hauls (especially after 1985, should be fixed with standardization step)

```{r NEUS-Fall processing}

#also, for northeast, we are going to delete any hauls before 2009 that are outside of +/- 5 minutes of 30 minutes and 2009 forward that are outside of +/- 5 minutes of 20 minutes
neus_fall_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
                                                        (survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year >= 2009 & (haul_dur > 0.25  & haul_dur < 0.42))), haul_id])
```

####NIGFS
Northern Ireland

Spring Northern Ireland (quarter 1)

```{r NIGFS spring visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 2,3,4
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r NIGFS 1 processing}
nigfs_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1" & month %in% c(2,3,4), haul_id])
```


Fall Northern Ireland (quarter 4)

```{r NIGFS fall visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 10,11
-Consistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r NIGFS 4 processing}
nigfs_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4" & month %in% c(10,11), haul_id])
```

####Nor-BTS

The original FishGlob compilation includes Nor-BTS-1 as well, but it was not shared by L. Pecuchet and is therefore ignored here

Nor-BTS-3
```{r Nor-BTS-3}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 8,9,10
-Somewhat consistent spatial distribution through time
-Number of hauls is variable, but no clear years to exclude
-Laurene Pecuchet (U Tromso) told us that only surveys from 2004 onward are suitable for biodiversity analyses


```{r Nor-BTS-3 processing}
nor_bts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3" & month %in% c(8:10) & year >= 2004, haul_id])
```

####NS-IBTS

NS-IBTS-1
```{r NS-IBTS-1}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 1,2,3
-Consistent spatial distribution through time
-Linear increase in richness; the cutoff is clearer in the # of hauls
-Linear increase in # of hauls, but somewhat clear break between late 70s and mid-80s; only keep hauls from 1984 onward


```{r NS-IBTS-1 processing}
ns_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1" & month %in% c(1:3) & year >= 1984, haul_id])
```
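
The start year here was chosen by eye from the hauls-per-year plot. As a cross-check, the annual counts can be tabulated directly. The sketch below uses invented data, and the threshold rule is purely illustrative (an assumption, not the rule used in this script):

```r
library(data.table)

# Invented haul records: sparse early years, denser later years
toy_ns_hauls <- data.table(
  haul_id = paste0("h", 1:12),
  year    = c(1980, 1981, 1982, 1983, rep(1984, 4), rep(1985, 4))
)

# Hauls per year
toy_ns_counts <- toy_ns_hauls[, .(n_hauls = uniqueN(haul_id)), by = year][order(year)]

# Illustrative rule only: earliest year with at least half of the
# maximum annual haul count
toy_start_year <- toy_ns_counts[n_hauls >= 0.5 * max(n_hauls), min(year)]
toy_start_year
```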

NS-IBTS-3
```{r NS-IBTS-3}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 7,8,9
-Consistent spatial distribution through time
-Consistent richness through time
-Early years lower # hauls, will start at 1998


```{r NS-IBTS-3 processing}
ns_ibts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3" & month %in% c(7:9) & year >= 1998, haul_id])
```


####NZ

NZ-CHAT

```{r NZ-CHAT}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 12,1,2 (NOTE THAT THIS NZ-CHAT SURVEY CROSSES YEAR, SO WE ALREADY LUMPED 12 with NEXT year)
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls from 1995 onward


```{r NZ-CHAT processing}


nz_chat_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT" & month %in% c(12,1,2) & year >= 1995, haul_id])

```
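
As noted above, December hauls were already assigned to the following year before this step, so `year >= 1995` applies to the survey season rather than the calendar year. A sketch of how such a reassignment can be done, assuming calendar `year` and `month` columns; `survey_year` is a hypothetical column name and the rows are invented:

```r
library(data.table)

# Invented hauls spanning a December-to-February survey season
toy_chat <- data.table(
  haul_id = paste0("h", 1:4),
  year    = c(1999, 2000, 2000, 2000),
  month   = c(12, 1, 2, 12)
)

# December hauls belong to the survey season that ends the following year
toy_chat[, survey_year := year + as.integer(month == 12)]
toy_chat
```

This should not be re-applied to `FishGlob.10year.uniquehauls`, where the shift has already been made; doing so would move December hauls forward a second time.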

NZ-ECSI

```{r NZ-ECSI}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 4,5,6
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls
-Gap between 1995 and 2005, but we have 10 total years so we'll keep for now


```{r NZ-ECSI processing}
nz_ecsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI" & month %in% c(4,5,6), haul_id])
```
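
The decision to keep NZ-ECSI rests on the number of sampled years despite the gap. Both quantities are easy to compute; the sketch below uses invented years and hypothetical `toy_` names rather than the real survey data:

```r
library(data.table)

# Invented haul records with a multi-year gap in sampling
toy_ecsi <- data.table(
  haul_id = paste0("h", 1:7),
  year    = c(1993, 1994, 1995, 2005, 2006, 2007, 2007)
)

toy_years       <- sort(unique(toy_ecsi$year))
toy_n_years     <- length(toy_years)      # number of distinct sampled years
toy_largest_gap <- max(diff(toy_years))   # largest spacing between consecutive sampled years

data.table(n_years = toy_n_years, largest_gap = toy_largest_gap)
```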

NZ-SUBA

```{r NZ-SUBA}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 11 and 12
-Consistent spatial distribution through time
-Seemingly consistent richness
-Far more hauls in 1990s, these early sampling years will be excluded (start in 2000)


```{r NZ-SUBA processing}
nz_suba_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA" & month %in% c(11,12) & year >= 2000, haul_id])
```

NZ-WCSI

```{r NZ-WCSI}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 3,4
-Consistent spatial distribution through time
-Seemingly consistent richness
-Linear decrease in # of hauls through time, leave out first two years with highest # hauls (>= 1995)


```{r NZ-WCSI processing}
nz_wcsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI" & month %in% c(3,4) & year >= 1995, haul_id])
```

####PT-IBTS
PT-IBTS
```{r PT-IBTS}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 9,10,11
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls


```{r PT-IBTS processing}
pt_ibts_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS" & month %in% c(9,10,11), haul_id])
```

####ROCKALL

```{r ROCKALL}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ROCKALL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ROCKALL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 8,9
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls


```{r ROCKALL processing}
rockall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL" & month %in% c(8,9), haul_id])
```

####S-GEORG

```{r S-GEORG}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "S-GEORG",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "S-GEORG",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 1 and 2
-Consistent spatial distribution through time
-Seemingly consistent richness except for 2003, will be excluded
-Seemingly consistent number of hauls, except for 2012, will be excluded


```{r SGeorge processing}
s_georg_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG" & month %in% c(1,2) & !(year %in% c(2003,2012)), haul_id])
```

####SCS

Spring
```{r SCS-SPRING}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 2,3,4
-Inconsistent spatial distribution through time (northern latitudes only sampled in early years), only include longitudes < -62 and latitudes < 45.5
-Seemingly consistent richness
-Number of hauls is variable; exclude years with very low or very high counts (1985,1994,2015,2019)


```{r scs_spring processing}
scs_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING" & month %in% c(2,3,4) & !(year %in% c(1985,1994,2015,2019)) & longitude_adj < -62 & latitude < 45.5, haul_id])
```
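
SCS-SPRING combines three criteria (months, excluded years, spatial bounds), so it can be useful to see how many hauls each criterion removes on its own. The sketch below tallies this on made-up data; `toy_scs`, `toy_tally`, and the `ok_` columns are hypothetical and only illustrate the idea.

```r
library(data.table)

# Toy stand-in for the SCS-SPRING rows of FishGlob.10year.uniquehauls
toy_scs <- data.table(
  haul_id       = paste0("s", 1:6),
  year          = c(1985L, 1990L, 1990L, 2000L, 2000L, 2015L),
  month         = c(3L, 3L, 6L, 2L, 4L, 3L),
  longitude_adj = c(-63, -63, -64, -60, -65, -63),
  latitude      = c(44, 44, 43, 46, 45, 44)
)

# One logical flag per criterion from the notes
toy_scs[, `:=`(
  ok_month = month %in% c(2, 3, 4),
  ok_year  = !(year %in% c(1985, 1994, 2015, 2019)),
  ok_space = longitude_adj < -62 & latitude < 45.5
)]

# Hauls failing each criterion, and hauls passing all three
toy_tally <- toy_scs[, .(
  n_total    = .N,
  fail_month = sum(!ok_month),
  fail_year  = sum(!ok_year),
  fail_space = sum(!ok_space),
  n_retained = sum(ok_month & ok_year & ok_space)
)]
toy_tally
```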

Summer
```{r SCS-SUMMER}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 6,7,8
-Consistent spatial distribution through time
-Richness increases linearly with no clear breakpoint, so use the breakpoint from # of hauls, and also exclude 2010, which has very high richness
-# Hauls increases linearly from ~120 in 1970 to ~220 in 2020 with no clear breakpoint; start at 1986 because there is a jump between 1985 and 1986
-Gear change in 1983 (Ellingsen et al. 2015)


```{r scs_summer processing}
scs_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER" & month %in% c(6,7,8) & year >= 1986 & year != 2010, haul_id])
```
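
The jump between 1985 and 1986 was judged by eye from the bar plot. One way to locate such a jump numerically is to take year-over-year differences in annual haul counts. This is a minimal base R sketch with made-up counts; the `toy_` objects are hypothetical and this check is not part of the original workflow.

```r
# Made-up annual haul counts with a step increase in 1986
toy_years  <- 1983:1988
toy_counts <- c(120, 122, 121, 150, 152, 151)

# Year-over-year change; element i is the change from year i to year i + 1
toy_change <- diff(toy_counts)

# First year after the largest single-year increase
toy_break_year <- toy_years[which.max(toy_change) + 1]
toy_break_year
```

With a gradual trend and noisy counts, the largest single-year change may not be a meaningful breakpoint, so the result should still be checked against the plots.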


####SEUS


Spring

```{r SEUS-spring}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 4,5,6
-Consistent spatial distribution through time
-Consistent richness through time
-# Hauls low in 1989 and 2018, will exclude

```{r seus_spring processing}
seus_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring" & month %in% c(4,5,6) & year != 1989 & year != 2018, haul_id])
```


Summer

```{r SEUS-summer}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 7,8
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls low in first year, otherwise okay, just exclude first year (1989)

```{r seus_summer processing}
seus_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer" & month %in% c(7,8) & year != 1989, haul_id])
```


Fall

```{r SEUS-fall}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 9,10,11
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls low in first year, otherwise okay, just exclude first year (1989)


```{r seus_fall processing}
seus_fall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall" & month %in% c(9,10,11) & year != 1989, haul_id])
```


####SWC-IBTS

Scotland Shelf Sea

SWC-IBTS 1

```{r SWC-IBTS-1}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 1,2,3
-Somewhat inconsistent spatial distribution through time, but this should be addressed in spatial standardization procedure 
-Richness consistent through time
-# Hauls consistent except low in 1995, just exclude 1995



```{r swc-ibts-1 processing}
swc_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1" & month %in% c(1,2,3) & year != 1995, haul_id])
```

SWC-IBTS 4

```{r SWC-IBTS-4}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 10,11,12
-Somewhat inconsistent spatial distribution through time (southern latitudes only sampled in early years), but this should be addressed in spatial standardization procedure 
-Richness consistent through time (especially after mid 90s)
-# Hauls consistent except low before 1995 and low in 2013, exclude these


```{r swc-ibts-4 processing}
swc_ibts_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4" & month %in% c(10,11,12) & year >= 1995 & year != 2013, haul_id])
```
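
Because year filters like this are easy to mistype, it can help to list the years that remain after filtering and compare them with the notes. The sketch below does this on made-up data, using the rule stated in the notes (months 10 to 12, 1995 onward, without 2013); `toy_swc` and the other `toy_` objects are hypothetical.

```r
library(data.table)

# Toy stand-in for the SWC-IBTS-4 rows of FishGlob.10year.uniquehauls
toy_swc <- data.table(
  haul_id = paste0("w", 1:6),
  year    = c(1993L, 1995L, 1996L, 2012L, 2013L, 2014L),
  month   = c(11L, 11L, 10L, 12L, 11L, 11L)
)

# Filter as described in the notes
toy_swc_sel <- toy_swc[month %in% c(10, 11, 12) & year >= 1995 & year != 2013]

# Years that survive the filter, for comparison against the notes
toy_retained_years <- sort(unique(toy_swc_sel$year))
toy_retained_years
```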

####WCANN


```{r WCANN}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "WCANN",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "WCANN",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "WCANN",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "WCANN",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "WCANN",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "WCANN",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-One exception here: use four months (6,7,8,9) because all four are sampled consistently, with lower latitude areas consistently sampled later in the summer
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls consistent through time


```{r wcann processing}
wcann_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "WCANN" & month %in% c(6:9), haul_id])
```

####WCTRI
-Exclude because only 10 years and overlaps somewhat with WCANN

```{r wctri processing}
wctri_keep <- NULL
```


####ZAF

ATL
```{r ZAF ATL}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 1,2,3
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls consistent through time after 1991


```{r zaf atl processing}
zaf_atl_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL" & month %in% c(1:3) & year >= 1991, haul_id])
```


IND
```{r ZAF IND}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 4,5,6
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls consistent through 2001, and then also in 2005 and 2009-2010


```{r zaf ind processing}
zaf_ind_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND" & month %in% c(4:6) & year %in% c(1985:2001,2005, 2009,2010), haul_id])
```


####Combine all lists that have _keep
```{r combine lists}
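#all objects ending in _keep, excluding the combined vector so this chunk can be re-run
list_obj <- setdiff(ls(pattern = "_keep$"), "fishglob_haulids_to_keep")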

#combine
fishglob_haulids_to_keep <- unlist(lapply(list_obj, get)) #229894 hauls (Started with 278405)

FishGlob.10year.spp_manualclean <- FishGlob.10year.spp[haul_id %in% fishglob_haulids_to_keep,]

#Require latitude and longitude for all observations
FishGlob.10year.spp_manualclean <- FishGlob.10year.spp_manualclean[complete.cases(FishGlob.10year.spp_manualclean[,.(latitude, longitude)])] #check that this works

#another check for # of years sampled
#new column for total number of years sampled per survey unit
FishGlob.10year.spp_manualclean[,years_sampled := length(unique(year)),.(survey_unit)]
if (interactive()) View(unique(FishGlob.10year.spp_manualclean[,.(survey_unit, years_sampled)]))

#save
saveRDS(FishGlob.10year.spp_manualclean, file = here::here("data","cleaned","FishGlob.10year.spp_manualclean.rds"))

```
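
Collecting objects by name pattern with `ls()` depends on what else is in the workspace. An alternative, sketched here in base R with toy vectors, is to gather the per-survey vectors in an explicit named list before combining. The `toy_` names and survey labels are hypothetical, and this is only one possible arrangement.

```r
# Hypothetical per-survey vectors of retained haul IDs
toy_selected <- list(
  survey_a = c("h1", "h2"),
  survey_b = c("h3"),
  survey_c = NULL  # an excluded survey contributes no hauls
)

# Number of hauls retained per survey
toy_n_per_survey <- lengths(toy_selected)

# Combine into one vector and confirm no haul ID appears twice
toy_all_ids <- unlist(toy_selected, use.names = FALSE)
stopifnot(anyDuplicated(toy_all_ids) == 0)
```

The per-survey counts also give a quick record of how much each survey contributes to the combined total.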


####Some surveys sample through end of year, fix these
-Note that the NZ-CHAT survey crosses the calendar year boundary, so hauls from months 1 and 2 are lumped with the previous year
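
A minimal sketch of what lumping months 1 and 2 with the previous year could look like, on made-up data. `toy_chat` and the `survey_year` column are hypothetical names chosen for illustration and may differ from what the script actually uses.

```r
library(data.table)

# Toy hauls from a survey season that runs from December into February
toy_chat <- data.table(
  haul_id = c("c1", "c2", "c3", "c4"),
  year    = c(2000L, 2001L, 2001L, 2001L),
  month   = c(12L, 1L, 2L, 12L)
)

# Assign January and February hauls to the year in which the season started
toy_chat[, survey_year := ifelse(month %in% c(1, 2), year - 1L, year)]
toy_chat
```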
